Preface


Differential equations and linear algebra are the two crucial courses in undergraduate 
mathematics. This new textbook develops those subjects separately and together.
Separate is normal—these ideas are truly important. This book presents the basic course 
on differential equations, in full: 


Chapter 1  First order equations

Chapter 2  Second order equations

Chapter 3  Graphical and numerical methods

Chapter 4  Matrices and linear systems

Chapter 6  Eigenvalues and eigenvectors

I will write below about the highlights and the support for readers. Here I focus on the
option to include more linear algebra. Many colleges and universities want to move in 
this direction, by connecting two essential subjects. 

More than ever, the central place of linear algebra is recognized. Limiting a student to 
the mechanics of matrix operations is over. Without planning it or foreseeing it, my lifework 
has been the presentation of linear algebra in books and video lectures:

Introduction to Linear Algebra (Wellesley-Cambridge Press)

MIT OpenCourseWare (ocw.mit.edu, Mathematics 18.06 in 2000 and 2014).

Linear algebra courses keep growing because the need keeps growing. At the same time,
a rethinking of the MIT differential equations course 18.03 led to a new syllabus. 
And independently, it led to this book. 

The underlying reason is that time is short and precious. The curriculum for many 
students is just about full. Still these two topics cannot be missed—and linear differential 
equations go in parallel with linear matrix equations. The prerequisite is calculus, for a single 
variable only—the key functions in these pages are inputs f(t) and outputs y(t). 
For all linear equations, continuous and discrete, the complete solution has two parts: 


One particular solution $y_p$:   $A y_p = b$

All null solutions $y_n$:   $A y_n = 0$


Those right hand sides add to $b + 0 = b$. The crucial point is that the left hand sides
add to $A(y_p + y_n)$. When the inputs add, and the equation is linear, the outputs add.
The equality $A(y_p + y_n) = b + 0$ tells us all solutions to $Ay = b$:


The complete solution to a linear equation is $y = (\text{one } y_p) + (\text{all } y_n)$.
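The same bookkeeping can be tried on a small matrix equation. The sketch below is only an illustration: the 2 by 2 matrix `A`, the right side `b`, and the helper names `matvec` and `complete_solution` are assumptions chosen for this example, not taken from the book.

```python
# Hypothetical example: A is singular, so A y = 0 has nonzero (null) solutions.
A = [[1.0, 1.0],
     [2.0, 2.0]]
b = [2.0, 4.0]


def matvec(A, y):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, y)) for row in A]


y_p = [1.0, 1.0]    # one particular solution: A y_p = b
y_n = [1.0, -1.0]   # one null solution: A y_n = 0


def complete_solution(c):
    """Return y = y_p + c * y_n, which solves A y = b for every constant c."""
    return [p + c * n for p, n in zip(y_p, y_n)]


for c in (0.0, 3.0, -2.5):
    # Each product A y should reproduce the right side b.
    print(c, matvec(A, complete_solution(c)))
```

Changing `c` moves along the whole family of solutions, while the product with `A` stays at `b`.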


The same steps give the complete solution to dy/dt = f(t), for the same reason.
We know the answer from calculus. It is the form of the answer that is important here:


$$\frac{dy_p}{dt} = f(t) \quad \text{is solved by} \quad y_p(t) = \int_0^t f(x)\,dx$$

$$\frac{dy_n}{dt} = 0 \quad \text{is solved by} \quad y_n(t) = C \ \text{(any constant)}$$

$$\frac{dy}{dt} = f(t) \quad \text{is completely solved by} \quad y(t) = y_p(t) + C$$


For every differential equation dy/dt = Ay + f(t), our job is to find $y_p$ and $y_n$:
one particular solution and all homogeneous solutions. My deeper purpose is to 
build confidence, so the solution can be understood and used. 


Differential Equations 


The whole point of learning calculus is to understand movement. An economy grows, 
currents flow, the moon rises, messages travel, your hand moves. The action is fast or 
slow depending on forces from inside and outside: competition, pressure, voltage, desire. 
Calculus explains the meaning of dy/dt, but to stop without putting it into an equation 
(a differential equation) is to miss the whole purpose. 


That equation may describe growth (often exponential growth $e^{at}$). It may describe
oscillation and rotation (with sines and cosines). Very frequently the motion approaches an
equilibrium, where forces balance. That balance point is found by linear algebra, when the
rate of change dy/dt is zero.


The need is to explain what mathematics can do. I believe in looking partly outside 
mathematics, to include what scientists and engineers and economists actually remember 
and constantly use. My conclusion is that first place goes to linear equations. The essence 
of calculus is to linearize around a present position, to find the direction and the speed 
of movement. 


Section 1.1 begins with the equations dy/dt = y and dy/dt = $y^2$. It is simply wonderful
that solving those two equations leads us here:


$$\frac{dy}{dt} = y \qquad y = 1 + t + \tfrac{1}{2}t^2 + \tfrac{1}{6}t^3 + \cdots \qquad y = e^t$$

$$\frac{dy}{dt} = y^2 \qquad y = 1 + t + t^2 + t^3 + \cdots \qquad y = 1/(1 - t)$$


To meet the two most important series in mathematics, right at the start, that is pure 
pleasure. No better practice is possible as the course begins. 
Important Choices of f(t)


Let me emphasize that a textbook must do more than solve random problems. We could 
invent functions f(t) forever, but that is not right. Much better to understand a small number 
of highly important functions: 


f(t) = sines and cosines   (oscillating and rotating)
f(t) = exponentials        (growing and decaying)
f(t) = 1 for t > 0         (a switch is turned on)
f(t) = impulse             (a sudden shock)


The solution y(t) is the response to those inputs—frequency response, exponential 
response, step response, impulse response. These particular functions and particular 
solutions are the best—the easiest to find and by far the most useful. All other solutions 
are built from these. 

I know that an impulse (a delta function that acts in an instant) is new to most students. 
This idea deserves to be here! You will see how neatly it works. The response is like the 
inverse of a matrix—it gives a formula for all solutions. The book will be supplemented by 
video lectures on many topics like this, because a visual explanation can be so effective. 


Support for Readers 


Readers should know all the support that comes with this book: 


math.mit.edu/dela is the key website. The time has passed for printing solutions to 
odd-numbered problems in the back of the book. The website can provide more detailed 
solutions and serious help. This includes additional worked problems, and codes for
numerical experiments, and much more. Please make use of everything and contribute.


ocw.mit.edu has complete sets of video lectures on both subjects (OpenCourseWare 
is also on YouTube). Many students know about the linear algebra lectures for 18.06 and 
18.06 SC. I am so happy they are helpful. For differential equations, the 18.03 SC videos 
and notes and exams are extremely useful. 

The new videos will be about special topics—possibly even the Tumbling Box. 


Linear Algebra 


I must add more about linear algebra. My writing life has been an effort to present 
this subject clearly. Not abstractly, not with a minimum of words, but in a way that is 
helpful to the reader. It is such good fortune that the central ideas in matrix algebra 
(a basis for a vector space, factorization of matrices, the properties of symmetric and 
orthogonal matrices) are exactly the ideas that make this subject so useful. Chapter 5
emphasizes those ideas and Chapter 7 explains the applications of $A^T A$.

Matrices are essential, not just optional. We are constantly acquiring and organizing 
and presenting data—the format we use most is a matrix. The goal is to see the relation 
between input and output. Often this relation is linear. In that case we can understand it. 
The idea of a vector space is so central. Take all combinations of two vectors or two
functions. I am always encouraging students to visualize that space; examples are really
the best. When you see all solutions to $v_1 + v_2 + v_3 = 0$ and $d^2y/dt^2 + y = 0$, you
have the idea of a vector space. This opens up the big questions of linear independence and
basis and dimension, by example.

If f(t) comes in continuous time, our model is a differential equation. If the input comes
in discrete time steps, we use linear algebra. The model predicts the output y(t) that is
created by the input f(t). But some inputs are simply more important than others: they are
easier to understand and much more likely to appear. Those are the right equations to present 
in this course. 


Notes to Faculty (and All Readers) 


One reason for publishing with Wellesley-Cambridge Press can be mentioned here. 
I work hard to keep book costs reasonable for students. This was just as important for 
Introduction to Linear Algebra. A comparison on Amazon shows that textbook prices 
from big publishers are more than double. Wellesley-Cambridge books are distributed by 
SIAM inside North America and Cambridge University Press outside, and from Wellesley, 
with the same motive. Certainly quality comes first. 


I hope you will see what this book offers. The first chapters are a normal textbook 
on differential equations, for a new generation. The complete book is a year’s course 
on differential equations and linear algebra, including Fourier and Laplace transforms— 
plus PDE’s (Laplace equation, heat equation, wave equation) and the FFT and the SVD. 

This is extremely useful mathematics! I cannot hope that you will read every word. 
But why should the reader be asked to look elsewhere, when the applications can come 
so naturally here? 

A special note goes to engineering faculty who look for support from mathematics. I 
have the good fortune to teach hundreds of engineering students every year. My work with 
finite elements and signal processing and computational science helped me to know what 
students need—and to speak their language. I see texts that mention the impulse response 
(for example) in one paragraph or not at all. But this is the fundamental solution from 
which all particular solutions come. In the book it is computed in the time domain, starting
with $e^{at}$, and again with Laplace transforms. The website goes further.

I know from experience that every first edition needs help. I hope you will tell me what 
should be explained more clearly. You are holding a book with a valuable goal—to become 
a textbook for a world of students and readers in a new generation and a new time, with 
limits and pressing demands on that time. The book won’t be perfect. I will be so grateful 
if you contribute, in any way, to making it better. 
1.3 Solve dy/dt = ay. Construct the exponential $e^{at}$.

The key formula in 1.4 gives the solution $y(t) = e^{at}y(0) + \int_0^t e^{a(t-s)} q(s)\,ds$.

The website with solutions and codes and extra examples and videos is math.mit.edu/dela

Chapter 1


First Order Equations 


1.1 Four Examples: Linear versus Nonlinear 


A first order differential equation connects a function y(t) to its derivative dy/dt.
That rate of change in y is decided by y itself (and possibly also by the time t).


Here are four examples. Example 1 is the most important differential equation of all. 


$$1)\ \frac{dy}{dt} = y \qquad 2)\ \frac{dy}{dt} = -y \qquad 3)\ \frac{dy}{dt} = 2ty \qquad 4)\ \frac{dy}{dt} = y^2$$


Those examples illustrate three linear differential equations (1, 2, and 3) and a
nonlinear differential equation. The unknown function y(t) is squared in Example 4.
The derivative y or -y or 2ty is proportional to the function y in Examples 1, 2, 3.
The graph of dy/dt versus y becomes a parabola in Example 4, because of $y^2$.

It is true that t multiplies y in Example 3. That equation is still linear in y and dy/dt.
It has a variable coefficient 2t, changing with time. Examples 1 and 2 have constant
coefficient (the coefficients of y are 1 and -1).


Solutions to the Four Examples 


We can write down a solution to each example. This will be one solution but it is not
the complete solution, because each equation has a family of solutions. Eventually there will
be a constant C in the complete solution. This number C is decided by the
starting value of y at t = 0, exactly as in ordinary integration. The integral of f(t) solves the
simplest differential equation of all, with y(0) = C:


$$5)\ \frac{dy}{dt} = f(t) \qquad \text{The complete solution is} \quad y(t) = \int_0^t f(s)\,ds + C.$$
For now we just write one solution to Examples 1 to 4. They all start at y(0) = 1.


$$1 \quad \frac{dy}{dt} = y \quad \text{is solved by} \quad y(t) = e^{t}$$

$$2 \quad \frac{dy}{dt} = -y \quad \text{is solved by} \quad y(t) = e^{-t}$$

$$3 \quad \frac{dy}{dt} = 2ty \quad \text{is solved by} \quad y(t) = e^{t^2}$$

$$4 \quad \frac{dy}{dt} = y^2 \quad \text{is solved by} \quad y(t) = \frac{1}{1 - t}$$


Notice: The three linear equations are solved by exponential functions (powers of e).
The nonlinear equation 4 is solved by a different type of function; here it is 1/(1 - t).
Its derivative is dy/dt = $1/(1 - t)^2$, which agrees with $y^2$.
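These four solutions can be checked numerically. What follows is a sketch only: it compares a centered-difference estimate of dy/dt with the right-hand side of each equation. The helper name `residual`, the step `h`, and the sample times are assumptions made for this illustration.

```python
import math

# Each entry: (right-hand side f(t, y) of dy/dt = f, claimed solution y(t)).
examples = {
    1: (lambda t, y: y,         lambda t: math.exp(t)),
    2: (lambda t, y: -y,        lambda t: math.exp(-t)),
    3: (lambda t, y: 2 * t * y, lambda t: math.exp(t * t)),
    4: (lambda t, y: y * y,     lambda t: 1.0 / (1.0 - t)),
}


def residual(rhs, y, t, h=1e-5):
    """Centered-difference estimate of dy/dt, minus the right-hand side."""
    slope = (y(t + h) - y(t - h)) / (2 * h)
    return slope - rhs(t, y(t))


for k, (rhs, y) in examples.items():
    # Sample times stay below t = 1, where the solution of Example 4 blows up.
    worst = max(abs(residual(rhs, y, t)) for t in (0.0, 0.25, 0.5))
    print(f"Example {k}: y(0) = {y(0.0)}, largest residual = {worst:.2e}")
```

A small residual is evidence, not proof. The proof is the one-line derivative computed by hand.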

Our special interest now is in linear equations with constant coefficients, like 1 and 2.
In fact dy/dt = y is the most important property of the great function $y = e^t$. Calculus
had to create $e^t$, because a function from algebra (like $y = t^n$) cannot equal its derivative
(the derivative of $t^n$ is $nt^{n-1}$). But a combination of all the powers $t^n$ can do it. That
good combination is $e^t$ in Section 1.3.


The final example extends 1 and 2, to allow any constant coefficient a:


$$6)\ \frac{dy}{dt} = ay \quad \text{is solved by} \quad y = e^{at} \quad (\text{and also } y = Ce^{at}).$$

If the constant growth rate a is positive, the solution increases. If a is negative, as in
dy/dt = -y with a = -1, the slope is negative and the solution $e^{-t}$ decays toward zero.
Figure 1.1 shows three exponentials, with dy/dt equal to y and 2y and -y.


Figure 1.1: Growth, faster growth, and decay. The solutions are $e^t$ and $e^{2t}$ and $e^{-t}$.
When a is larger than 1, the solution grows faster than $e^t$. That is natural. The neat thing
is that we still follow the exponential curve, but $e^{at}$ climbs that curve faster. You could see
the same result by rescaling the time axis. In Figure 1.1, the steepest curve
(for a = 2) is the same as the first curve, but the time axis is compressed by 2.

Calculus sees this factor of 2 from the chain rule for $e^{2t}$. It sees the factor 2t from
the chain rule for $e^{t^2}$. This exponent is $t^2$, the factor 2t is its derivative:


$$\frac{d}{dt}\left(e^{u}\right) = e^{u}\,\frac{du}{dt} \qquad \frac{d}{dt}\left(e^{2t}\right) = \left(e^{2t}\right) \text{ times } 2 \qquad \frac{d}{dt}\left(e^{t^2}\right) = \left(e^{t^2}\right) \text{ times } 2t$$


Problem Set 1.1


1 Draw the graph of $y = e^t$ by hand, for -1 < t < 1. What is its slope dy/dt at
t = 0? Add the straight line graph of y = et (the number e times t). Where do those two graphs cross?


2 Draw the graph of $y_1 = e^{2t}$ on top of $y_2 = 2e^t$. Which function is larger at t = 0?
Which function is larger at t = 1?


3 What is the slope of $y = e^{-t}$ at t = 0? Find the slope dy/dt at t = 1.


4 What "logarithm" do we use for the number t (the exponent) when $e^t = 4$?


5 State the chain rule for the derivative dy/dt if y(t) = f(u(t)) (chain of f and u).


6 The second derivative of $e^t$ is again $e^t$. So $y = e^t$ solves $d^2y/dt^2 = y$. A second
order differential equation should have another solution, different from $y = Ce^t$.
What is that second solution?


7 Show that the nonlinear example dy/dt = $y^2$ is solved by y = C/(1 - Ct)
for every constant C. The choice C = 1 gave y = 1/(1 - t), starting from y(0) = 1.


8 Why will the solution to dy/dt = $y^2$ grow faster than the solution to dy/dt = y
(if we start them both from y = 1 at t = 0)? The first solution blows up at t = 1.
The second solution $e^t$ grows exponentially fast but it never blows up.


9 Find a solution to dy/dt = $-y^2$ starting from y(0) = 1. Integrate $dy/y^2$ and $-dt$.
(Or work with z = 1/y. Then dz/dt = (dz/dy)(dy/dt) = $(-1/y^2)(-y^2)$ = 1.
From dz/dt = 1 you will know z(t) and y = 1/z.)

10 Which of these differential equations are linear (in y)?

(a) $y' + \sin y = t$   (b) $y' = (y - t)$   (c) $y' + e^t y = t^{10}$.

11 The product rule gives what derivative for $e^t e^{-t}$? This function is constant. At t = 0
this constant is 1. Then $e^t e^{-t} = 1$ for all t.


12 dy/dt = y + 1 is not solved by $y = e^t + t$. Substitute that y to show it fails. We can't
just add the solutions to y' = y and y' = 1. What number c makes $y = e^t + c$ into a
correct solution?
1.2 The Calculus You Need


The prerequisite for differential equations is calculus. This may mean a year or more of
ideas and homework problems and rules for computing derivatives and integrals. Some of
those topics are essential, but others (as we all acknowledge) are not really of first
importance. These pages have a positive purpose, to bring together essential facts of calculus.
This section is to read and refer to; it doesn't end with a Problem Set.

I hope this outline may have value also at the end of a single-variable calculus course. 
Textbooks could include a summary of the crucial ideas, but usually they don’t. Certainly 
the reader will not agree with every choice made here, and the best outcome would be a more 
perfect list. This one is a lot shorter than I expected. 

At the end, a useful formula in differential equations is confirmed by the product rule,
the derivative of $e^x$, and the Fundamental Theorem of Calculus.


1. Derivatives of key functions:   $x^n$   $\sin x$   $\cos x$   $e^x$   $\ln x$


The derivatives of $x, x^2, x^3, \ldots$ come from first principles, as limits of $\Delta y/\Delta x$. The
derivatives of $\sin x$ and $\cos x$ focus on the limit of $(\sin \Delta x)/\Delta x$. Then comes the great
function $e^x$. It solves the differential equation dy/dx = y starting from y(0) = 1.
This is the single most important fact needed from calculus: the knowledge of $e^x$.


2. Rules for derivatives:   Sum rule   Product rule   Quotient rule   Chain rule


When we add, subtract, multiply, and divide the five original functions, these rules give the
derivatives. The sum rule is the quiet one, applied all the time to linear differential equations.
This equation is linear (a crucial property):

$$\frac{dy}{dt} = ay + f(t) \quad \text{and} \quad \frac{dz}{dt} = az + g(t) \quad \text{add to} \quad \frac{d}{dt}(y + z) = a(y + z) + (f + g).$$

With a = 0 that is a straightforward sum rule for the derivative of y + z. We can always
add equations as shown, because a(t)y is linear in y. This confirms superposition of the
separate solutions y and z. Linear equations add and their solutions add.

The chain rule is the most prolific, in computing the derivatives of very remarkable
functions. The chain $y = e^x$ and $x = \sin t$ produces $y = e^{\sin t}$ (the composite of two
functions). The chain rule gives dy/dt by multiplying the derivatives dy/dx and dx/dt:

$$\textbf{Chain rule} \qquad \frac{dy}{dt} = \frac{dy}{dx}\,\frac{dx}{dt} = e^x \cos t = y \cos t.$$

Then $e^{\sin t}$ solves that differential equation dy/dt = ay with varying growth rate a = cos t.
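A short numerical check of this chain rule example, offered as a sketch. The function names and the step `h` are assumptions for illustration; the slope from the chain rule is compared with a centered difference of the composite function.

```python
import math


def y(t):
    """The composite function y = e^x with x = sin t."""
    return math.exp(math.sin(t))


def chain_rule_slope(t):
    """(dy/dx)(dx/dt) = e^x cos t, with x = sin t."""
    return math.exp(math.sin(t)) * math.cos(t)


def numerical_slope(t, h=1e-5):
    """Centered-difference estimate of dy/dt."""
    return (y(t + h) - y(t - h)) / (2 * h)


for t in (0.0, 1.0, 2.5):
    # The two slopes should agree closely at every sample time.
    print(t, chain_rule_slope(t), numerical_slope(t))
```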
3. The Fundamental Theorem of Calculus


The derivative of the integral of f(x) is f(x). The integral from 0 to x of the derivative
df/dx is f(x) - f(0). One operation inverts the other, when f(0) = 0. This is not so easy
to prove, because both the derivative and the integral involve a limit step $\Delta x \to 0$.

One way to go forward starts with numbers $y_0, y_1, \ldots, y_n$. Their differences are like
derivatives. Adding up those differences is like integrating the derivative:


$$\textbf{Sum of differences} \qquad (y_1 - y_0) + (y_2 - y_1) + \cdots + (y_n - y_{n-1}) = y_n - y_0. \tag{1}$$


Only $y_n$ and $-y_0$ are left because all other numbers $y_1, y_2, \ldots$ come twice and cancel.
To make that equation look like calculus, multiply every term by $\Delta x/\Delta x = 1$:


$$\left[\frac{y_1 - y_0}{\Delta x} + \frac{y_2 - y_1}{\Delta x} + \cdots + \frac{y_n - y_{n-1}}{\Delta x}\right]\Delta x = y_n - y_0. \tag{2}$$

Again, this is true for all numbers $y_0, y_1, \ldots, y_n$. Those can be heights of the graph of a
function y(x). The points $x_0, \ldots, x_n$ can be equally spaced between x = a and x = b. Then
each ratio $\Delta y/\Delta x$ is a slope between two points of the graph:


$$\frac{\Delta y}{\Delta x} = \frac{y_k - y_{k-1}}{x_k - x_{k-1}} = \frac{\text{distance up}}{\text{distance across}} = \text{slope}. \tag{3}$$


This slope is exactly correct if the graph is a straight line between the points $x_{k-1}$ and $x_k$.
If the graph is a curve, the approximate slope $\Delta y/\Delta x$ becomes exact as $\Delta x \to 0$.

The delicate part is the requirement $n\Delta x = b - a$, to space the points evenly
from $x_0 = a$ to $x_n = b$. Then n will increase as $\Delta x$ decreases. Equation (2) remains
correct at every step, with $y_0 = y(a)$ at the first point and $y_n = y(b)$ at the last point.
As $\Delta x \to 0$ and $n \to \infty$, the slopes $\Delta y/\Delta x$ approach the derivative dy/dx. At the
same time the sum approaches the integral of dy/dx. Equation (2) turns into equation (4):


$$\textbf{Fundamental Theorem of Calculus} \qquad \int_a^b \frac{dy}{dx}\,dx = y(b) - y(a) \qquad \frac{d}{dx}\int_a^x f(s)\,ds = f(x) \tag{4}$$


The limits of $\Delta y/\Delta x$ in (3) and the sum in (2) produce dy/dx and its integral. Of course
this presentation of the Fundamental Theorem needs more careful attention. But equation
(1) holds a key idea: a sum of differences. This leads to an integral of derivatives.
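The sum of differences can be watched numerically. This is a sketch under stated assumptions: the test function y = sin x on the interval from 0 to 2, the rectangle rule with left endpoints, and the function names are all choices made for this illustration.

```python
import math


def sum_of_slopes(y, a, b, n):
    """Left side of equation (2): add the slopes (Delta y / Delta x), each times Delta x."""
    dx = (b - a) / n
    xs = [a + k * dx for k in range(n + 1)]
    total = 0.0
    for k in range(1, n + 1):
        slope = (y(xs[k]) - y(xs[k - 1])) / dx
        total += slope * dx
    return total


def riemann_sum_of_derivative(dydx, a, b, n):
    """Approximate the integral of dy/dx with n rectangles (left endpoints)."""
    dx = (b - a) / n
    return sum(dydx(a + k * dx) for k in range(n)) * dx


a, b = 0.0, 2.0
exact = math.sin(b) - math.sin(a)   # y(b) - y(a) for y = sin x
for n in (10, 100, 1000):
    print(n, sum_of_slopes(math.sin, a, b, n),
          riemann_sum_of_derivative(math.cos, a, b, n), exact)
```

The telescoping sum matches y(b) - y(a) for every n, up to roundoff. The rectangle sum of the true derivative only approaches that value as n grows, which is the limit step in the theorem.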


4. The meaning of symbols and the operations of algebra 


Mathematics is a language. The way to learn this language is to use it. So textbooks have
thousands of exercises, to practice reading and writing symbols like y(x) and $y(x + \Delta x)$.
Here is a typical line of symbols:


$$\textbf{Derivative of } y \qquad \frac{dy}{dt} = \lim_{\Delta t \to 0} \frac{y(t + \Delta t) - y(t)}{\Delta t} \tag{5}$$


I am not very sure that this is clear. One function is y, the other function is its derivative y'.
Could the symbol y' be better than dy/dt? Both are standard in this book. In calculus
we know y(t), in differential equations we don't. The whole point of the differential equation
is to connect y and y'. From that connection we have to discover what they are.

A first example is y' = y. That equation forces the unknown function y to grow
exponentially: $y(t) = Ce^t$. At the end of this section I want to propose a more complicated
equation and its solution. But I could never find a more important example than $e^t$.


5. Three ways to use $dy/dx \approx \Delta y/\Delta x$


On the graph of a function y(x), the exact slope is dy/dx and the approximate slope
(between nearby points) is $\Delta y/\Delta x$. If we know any two of the numbers dy/dx and
$\Delta y$ and $\Delta x$, then we have a good approximation to the third number. All three
approximations are important, because dy/dx is such a central idea in calculus.


(A) When we know $\Delta x$ and dy/dx, we have $\Delta y \approx (\Delta x)(dy/dx)$.


This is linear approximation. From a starting point $x_0$, we move a distance $\Delta x$. That
produces a change $\Delta y$. The graph of y(x) can go up or down, and the best information
we have is the slope dy/dx at $x_0$. (That number gives no way to account for bending of the
graph, which appears in the next derivative $d^2y/dx^2$.)

Linear approximation is equivalent to following the tangent line, not the curve:


$$\Delta y \approx \Delta x\,\frac{dy}{dx}(x_0) \qquad y(x_0 + \Delta x) \approx y(x_0) + \Delta x\,\frac{dy}{dx}(x_0) \tag{6}$$


(B) $\Delta y$ and dy/dx lead to $\Delta x \approx (\Delta y)/(dy/dx)$. This is Newton's Method.


Newton's Method is a way to solve y(x) = 0, starting at a point $x_0$. We want y(x) to
drop from $y(x_0)$ to zero at the new point $x_1$. The desired change in y is $\Delta y = 0 - y(x_0)$.
What we don't know is $\Delta x$, which locates $x_1$. The exact slope dy/dx will be close to
$\Delta y/\Delta x$, and that tells us a good $\Delta x$:


$$\textbf{Newton's Method} \qquad \Delta x = \frac{\Delta y}{dy/dx} \qquad x_1 - x_0 = \frac{-y(x_0)}{(dy/dx)(x_0)} \tag{7}$$


Guess $x_0$, improve to $x_1$. This is an excellent way to solve nonlinear equations y(x) = 0.
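Newton's update can be repeated in a few lines, shown here as a minimal sketch. The test equation x^2 - 2 = 0, the starting guess, the number of steps, and the function name `newton` are assumptions chosen for this example.

```python
def newton(y, dydx, x0, steps=6):
    """Repeat the update x_new = x - y(x) / y'(x) and record every iterate."""
    x = x0
    history = [x]
    for _ in range(steps):
        x = x - y(x) / dydx(x)
        history.append(x)
    return history


# Hypothetical test problem: solve y(x) = x^2 - 2 = 0. The positive root is sqrt(2).
iterates = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
for k, x in enumerate(iterates):
    print(k, x)
```

Each step follows the tangent line down to zero, so the guess improves quickly once it is near the root. A poor starting point, or a slope near zero, can still send the iteration astray.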


(C) Dividing $\Delta y$ by $\Delta x$ gives the approximation $dy/dx \approx \Delta y/\Delta x$.

That is the point of equation (5), but something important often escapes our attention.
Are x and $x + \Delta x$ the best two places to compute y? Writing $\Delta y = y(x + \Delta x) - y(x)$
doesn't seem to offer other choices. If we notice that $\Delta x$ can be negative, this allows
$x + \Delta x$ to be on the left side of x (leading to a backward difference). The best choice
is not forward or backward but centered around x: a half step each way.


$$\textbf{Centered difference} \qquad \frac{dy}{dx} \approx \frac{\Delta y}{\Delta x} = \frac{y(x + \frac{1}{2}\Delta x) - y(x - \frac{1}{2}\Delta x)}{\Delta x} \tag{8}$$
Why is centering better? When y = Cx + D has a straight line graph, all ratios
$\Delta y/\Delta x$ give the correct slope C. But the parabola $y = x^2$ has the simplest possible
bending, and only this centered difference gives the correct slope 2x (varying with x).


$$\textbf{Exact slope for parabolas by centering} \qquad \frac{\Delta y}{\Delta x} = \frac{(x + \frac{1}{2}\Delta x)^2 - (x - \frac{1}{2}\Delta x)^2}{\Delta x} = \frac{x\Delta x - (-x\Delta x)}{\Delta x} = 2x$$


The key step in scientific computing is improving first order accuracy (forward differences) to 
second order accuracy (centered differences). For integrals, rectangle rules improve 
to trapezoidal rules. This is a big step to good algorithms. 
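The gain from centering shows up when the step is halved. The sketch below assumes the test function e^x at x = 1 and three step sizes chosen for illustration; the function names are also hypothetical.

```python
import math


def forward_difference(y, x, dx):
    """First order estimate: uses the points x and x + dx."""
    return (y(x + dx) - y(x)) / dx


def centered_difference(y, x, dx):
    """Second order estimate: a half step on each side of x."""
    return (y(x + dx / 2) - y(x - dx / 2)) / dx


x = 1.0
exact = math.exp(x)   # the derivative of e^x is e^x
for dx in (0.1, 0.05, 0.025):
    err_f = abs(forward_difference(math.exp, x, dx) - exact)
    err_c = abs(centered_difference(math.exp, x, dx) - exact)
    print(f"dx = {dx}: forward error {err_f:.2e}, centered error {err_c:.2e}")
```

Halving the step roughly halves the forward error and divides the centered error by about four. That is the meaning of first order and second order accuracy.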


6. Taylor series: Predicting y(x) from all the derivatives at $x = x_0$


From the height $y_0$ and the slope $y_0'$ at $x_0$, we can predict the height y(x) at nearby points.
But the tangent line in equation (6) assumes that y(x) has constant slope. That first order
prediction becomes a second order prediction (much more accurate) when we use
the second derivative $y_0''$ at $x_0$.


$$\textbf{Tangent parabola using } y_0'' \qquad y(x_0 + \Delta x) \approx y_0 + (\Delta x)\,y_0' + \tfrac{1}{2}(\Delta x)^2\,y_0''. \tag{9}$$

Adding this $(\Delta x)^2$ term moves us from constant slope to constant bending. For the
parabola $y = x^2$, equation (9) is exact: $(x_0 + \Delta x)^2 = (x_0^2) + (\Delta x)(2x_0) + \tfrac{1}{2}(\Delta x)^2(2)$.


Taylor added more terms, infinitely many. His formula gets all derivatives correct
at $x_0$. The pattern is set by $\tfrac{1}{2}(\Delta x)^2 y_0''$. The $n$th derivative $y^{(n)}(x)$ contributes a new
term $\tfrac{1}{n!}(\Delta x)^n y_0^{(n)}$. The complete Taylor series includes all derivatives at the point $x = x_0$:


$$\textbf{Taylor series} \qquad y(x_0 + \Delta x) = y_0 + (\Delta x)\,y_0' + \cdots + \frac{1}{n!}(\Delta x)^n y_0^{(n)} + \cdots = \sum_{n=0}^{\infty} \frac{(\Delta x)^n}{n!}\,y^{(n)}(x_0) \tag{10}$$

Stop at $y'$ for the tangent line. Stop at $y''$ for the parabola.


Those equal signs are not always right. There is no way we can stop y(x) from making a
sudden change after x moves away from $x_0$. Taylor's prediction of $y(x_0 + \Delta x)$ is exactly
correct for $e^x$ and $\sin x$ and $\cos x$; good functions like those are "analytic" at all x.


Let me include here the two most important examples in all of mathematics. They are
solutions to dy/dx = y and dy/dx = $y^2$, the most basic linear and nonlinear equations.


$$\textbf{Exponential series with } y^{(n)}(0) = 1 \qquad y = e^x = 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \cdots \tag{11}$$

$$\textbf{Geometric series with } y^{(n)}(0) = n! \qquad y = \frac{1}{1 - x} = 1 + x + x^2 + x^3 + \cdots \tag{12}$$


The center point is $x_0 = 0$. The series (11) gives $e^x$ for every x. The series (12) gives
1/(1 - x) when x is between -1 and 1. Its derivative $1 + 2x + 3x^2 + \cdots$ is $1/(1 - x)^2$.
For x = 2 that geometric series will certainly not produce 1/(1 - 2) = -1. Notice
that $1 + x + x^2 + \cdots$ becomes infinite at x = 1, exactly where 1/(1 - x) becomes 1/0.


The key point for $e^x$ is that its $n$th derivative is 1 at x = 0. The $n$th derivative of
1/(1 - x) is n! at x = 0. This pattern starts with y, y', y'', y''' equal to 1, 1, 2, 6 at x = 0:


$$y = (1 - x)^{-1} \qquad y' = (1 - x)^{-2} \qquad y'' = 2(1 - x)^{-3} \qquad y''' = 6(1 - x)^{-4}.$$

Taylor's formula combines the contributions of all derivatives at x = 0, to produce y(x).
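Partial sums make the difference between the two series visible. This is a sketch with assumed names (`exp_series`, `geometric_series`) and assumed sample points and term counts; it simply adds the terms of series (11) and (12).

```python
import math


def exp_series(x, terms):
    """Partial sum 1 + x + x^2/2! + ... of series (11), with the given number of terms."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)   # turns x^n/n! into x^(n+1)/(n+1)!
    return total


def geometric_series(x, terms):
    """Partial sum 1 + x + x^2 + ... of series (12), with the given number of terms."""
    return sum(x ** n for n in range(terms))


print(exp_series(2.0, 30), math.exp(2.0))             # converges for every x
print(geometric_series(0.5, 60), 1.0 / (1.0 - 0.5))   # converges when |x| < 1
print(geometric_series(2.0, 20), 1.0 / (1.0 - 2.0))   # keeps growing, never approaches -1
```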


7. Application: An important differential equation 


The linear differential equation y’ = ay + q(t) is a perfect multipurpose model. It 
includes the growth rate a and the external source term q(t). We want the particular 
solution that starts from y(0) = 0. Creating that solution uses the most essential idea 


behind integration. Verifying that the solution is correct uses the basic rules for derivatives. 
Many students in my graduate class had forgotten the derivative of the integral. 
Here is the solution y(t) followed by its interpretation, with a = 1 for simplicity : 


$$\frac{dy}{dt} = y + q(t) \quad \text{is solved by} \quad y(t) = \int_0^t e^{t-s}\,q(s)\,ds. \tag{13}$$


Key idea: At each time s between 0 and t, the input is a source of strength q(s). That input
grows or decays over the remaining time t - s. The input q(s) is multiplied by $e^{t-s}$
to give an output at time t. Then the total output y(t) is the integral of $e^{t-s}q(s)$.


We will reach y(t) in other ways. Section 1.4 uses an "integrating factor." Section 1.6
explains "variation of parameters." The key is to see where the formula comes from.
Inputs lead to outputs, the equation is linear, and the principle of superposition applies. 
The total output is the sum (in this case, the integral) of all those outputs. 

We will confirm formula (13) by computing dy/dt. First, $e^{t-s}$ equals $e^t$ times $e^{-s}$.
Then $e^t$ comes outside the integral of $e^{-s}q(s)$. Use the product rule on those two factors:


$$\textbf{Producing } y + q \qquad \frac{dy}{dt} = \left(\frac{de^t}{dt}\right)\int_0^t e^{-s}q(s)\,ds + \left(e^t\right)\frac{d}{dt}\int_0^t e^{-s}q(s)\,ds. \tag{14}$$


The first term on the right side is exactly y(t). How to recognize that last term as q(t)?

We don't need to know the function q(t). What we do know (and need) is the Fundamental
Theorem of Calculus. The derivative of the integral of $e^{-t}q(t)$ is $e^{-t}q(t)$.
Then multiplying by $e^t$ gives the hoped-for result q(t), because $e^t e^{-t} = 1$. The linear
differential equation y' = y + q with y(0) = 0 is solved by the integral of $e^{t-s}q(s)$.
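Formula (13) can be tested on one source term, as a sketch. The choice q(t) = t is an assumption for illustration; for that input, y = e^t - 1 - t solves y' = y + t with y(0) = 0, as substitution confirms. The trapezoidal rule and the name `response` are also choices made here.

```python
import math


def response(q, t, n=2000):
    """Trapezoidal-rule estimate of y(t) = integral from 0 to t of e^(t-s) q(s) ds."""
    if t == 0.0:
        return 0.0
    ds = t / n
    values = [math.exp(t - k * ds) * q(k * ds) for k in range(n + 1)]
    return ds * (sum(values) - 0.5 * (values[0] + values[-1]))


def q(s):
    """Hypothetical source term q(t) = t."""
    return s


def exact(t):
    """Solution of y' = y + t with y(0) = 0."""
    return math.exp(t) - 1.0 - t


for t in (0.5, 1.0, 2.0):
    print(t, response(q, t), exact(t))
```

The integral adds up the grown inputs $e^{t-s}q(s)$ one slice at a time, which is superposition carried out numerically.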
1.3 The Exponentials $e^t$ and $e^{at}$


Here is the key message from this section: The solutions to dy/dt = ay are $y(t) = Ce^{at}$.
That free constant C matches the starting value y(0). Then $y(t) = y(0)e^{at}$.

I realize that you already know the function $y = e^t$. It is the star of precalculus
and calculus. Now it becomes the key to linear differential equations. Here I focus on the
two most important properties of this function $e^t$:


1. The slope dy/dt equals the function y. As y grows, its graph gets steeper:


$$\frac{d}{dt}\,e^t = e^t. \tag{1}$$


2. $y(t) = e^t$ follows the addition rule for exponents:


$$e^t \text{ times } e^T \text{ equals } e^{t+T}. \tag{2}$$


How is this exponential function constructed? Only calculus can do it, because
somewhere we must have a "limit step." Functions from ordinary algebra can get close
to $e^t$, but they can't reach it. If we choose those functions to come closer
and closer, then their limit is $e^t$.

This is like using fractions to approach the extraordinary number $\pi$. The fractions
can start with 3/1 and 31/10 and 314/100. The neat fraction 22/7 is close to $\pi$. But
"taking the limit" can't be avoided, because $\pi$ itself is not a fraction.

Similarly e is not a fraction. On this book's home page math.mit.edu/dela is an
article called Introducing $e^x$. It describes four popular ways to construct this function.
The one chosen now is my favorite, because it is the most direct way.


Construct $y = e^t$ so that dy/dt = y (starting from y = 1 at t = 0)


To show how this construction works, here are ordinary polynomials y and dy/dt:

1. $y = 1 + t + \tfrac{1}{2}t^2$   The derivative is $dy/dt = 0 + 1 + t$

2. $y = 1 + t + \tfrac{1}{2}t^2 + \tfrac{1}{6}t^3$   The derivative is $dy/dt = 0 + 1 + t + \tfrac{1}{2}t^2$


You see that dy/dt does not fully agree with y. It always falls one term short of y.
We could get $t^3/6$ into the derivative by including $t^4/24$ in y. But now dy/dt will be
missing $t^4/24$.

You can see that dy/dt won’t catch up to y. The way out is to have infinitely many terms : 
Don’t stop. Then you get dy/dt = y. 
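The "one term short" pattern can be seen with exact fractions. This sketch stores a polynomial as a list of coefficients, lowest power first; that representation and the names `truncated_exponential` and `derivative` are assumptions made for this illustration.

```python
from fractions import Fraction
from math import factorial


def truncated_exponential(degree):
    """Coefficients of 1 + t + t^2/2! + ... + t^degree/degree!, lowest power first."""
    return [Fraction(1, factorial(n)) for n in range(degree + 1)]


def derivative(coeffs):
    """Differentiate a polynomial given by its coefficient list."""
    return [n * c for n, c in enumerate(coeffs)][1:]


y = truncated_exponential(3)
dy = derivative(y)
print(y)    # coefficients of 1 + t + t^2/2 + t^3/6
print(dy)   # coefficients of 1 + t + t^2/2, so the t^3/6 term is missing
```

Raising the degree never closes the gap: the derivative of the degree n polynomial is the degree n - 1 polynomial.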
The limit step reaches an infinite series, adding new terms and never stopping. Every
term has the form $t^n$ divided by n! (n factorial). Its derivative is the previous term:


$$\text{The derivative of } \frac{t^n}{n!} \text{ is } \frac{n\,t^{n-1}}{(n)(n-1)\cdots(1)} = \frac{t^{n-1}}{(n-1)!} \tag{3}$$

So if $t^n/n!$ is missing in dy/dt, we will capture it by including $t^{n+1}/(n + 1)!$ in y.


Of course dy/dt never completely catches up to y, until we allow an infinite series.
There is a term $t^n/n!$ for every n. The term for n = 0 is $t^0/0! = 1$.


$$\textbf{Construction of } e^t \qquad y(t) = e^t = 1 + t + \frac{t^2}{2} + \frac{t^3}{6} + \frac{t^4}{24} + \cdots = \sum_{n=0}^{\infty} \frac{t^n}{n!} \tag{4}$$


Taking the derivative of every term produces all the same terms. So dy/dt = y.
Notice: If you change every t to at, the derivative of $y = e^{at}$ becomes a times $e^{at}$:

$$\frac{d}{dt}\left(1 + at + \frac{a^2t^2}{2} + \frac{a^3t^3}{6} + \cdots\right) = a\left(1 + at + \frac{a^2t^2}{2} + \cdots\right) = ae^{at} \tag{5}$$

This construction of $e^t$ brings up two questions, to be discussed in the Chapter 1 Notes.
Does the infinite series add to a finite number (a different number for each choice of t)?
Can we add the derivatives of each $t^n/n!$ and safely get the derivative of the sum $e^t$?
Fortunately both answers are yes. The terms get very small, very fast, as n increases.
The limiting step is $n \to \infty$, producing the exact $e^t$.

When t = 1, we can watch the terms get small. We must do this, because t = 1 leads to
the all-important number $e^1$ which is e:


$$\textbf{The series for } e \textbf{ at } t = 1 \qquad e = 1 + 1 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} + \cdots$$


The first three terms add to 2.5. The first five terms almost reach 2.71. We never reach 2.72. 
With enough terms you can barely pass 2.71828. It is certain that the total sum e is not a 
fraction. It never appears in algebra, but it is the key number for calculus. 
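Those partial sums are easy to tabulate. The sketch below uses an assumed function name and an assumed cutoff of 12 terms; it keeps a running total of the series for e at t = 1.

```python
import math


def partial_sums_for_e(count):
    """Running totals of 1 + 1 + 1/2 + 1/6 + 1/24 + ... (the series for e at t = 1)."""
    sums, total, term = [], 0.0, 1.0
    for n in range(count):
        total += term
        sums.append(total)
        term /= (n + 1)   # 1/n! becomes 1/(n+1)!
    return sums


for n, s in enumerate(partial_sums_for_e(12), start=1):
    print(f"first {n:2d} terms: {s:.8f}")
print("math.e        :", math.e)
```

The totals climb toward e from below, since every term is positive, and the factorials in the denominators make the later terms tiny.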


The Series for $e^t$ is a Taylor Series


The infinite series (4) for $e^t$ is the same as the Taylor series. Section 1.2 went from the
tangent line $1 + t$ to the tangent parabola $1 + t + \frac{1}{2}t^2$. The next term will be $\frac{1}{6}t^3$, because
that matches the third derivative $y''' = 1$ at t = 0. All derivatives are equal to 1 at t = 0,
when we start from the basic equation $y' = y$. That equation gives $y'' = y' = y$ and
the next derivative gives $y''' = y'' = y' = y$.

Conclusion: $t^n/n!$ has the correct nth derivative (which is 1) at the point t = 0.
All these terms go into the Taylor series. The result is exactly the exponential series (4).

Multiplying Powers by Adding Exponents


We write $3^2$ for 3 times 3. We write $e^2$ for e times e. The question is, does e = 2.718...
times e = 2.718... give the same answer as setting t = 2 in the infinite series to get $e^2$?
The answer is again yes. I could say “fortunately yes” but that might suggest a
lucky accident. The amazing fact is that Property 1 ($y' = y$ is now confirmed) leads
automatically to Property 2. The exponential starts from $y(0) = e^0 = 1$ at time t = 0.


Property 2. $e^t$ times $e^T$ equals $e^{t+T}$ so $(e^1)(e^1) = e^2$
This is a differential equations course, so the proofs will use Property 1: dy/dt = y. 


First Proof. We can solve $y' = (a + b)y$ two ways, starting from y(0) = 1. We know that
$y(t) = e^{(a+b)t}$. Another solution is $y(t) = e^{at}e^{bt}$, as the product rule shows:

$$\frac{d}{dt}\left(e^{at}e^{bt}\right) = \left(ae^{at}\right)e^{bt} + e^{at}\left(be^{bt}\right) = (a + b)\,e^{at}e^{bt}. \tag{6}$$

This solution $e^{at}e^{bt}$ also starts at $e^0e^0 = 1$. It must be the same as the first solution $e^{(a+b)t}$.
The equation $y' = (a + b)y$ only has one solution. At t = 1 this says that $e^{a+b} = e^a e^b$. QED.


Second Proof. Starting with y = 1 at t = 0, the solution out to time t is $e^t$. The
solution to time t + T is $e^{t+T}$. The question is, do we also get that answer in two steps?

Starting from y = 1 at t = 0, we go to $e^t$. Then start from $e^t$ at time t and continue an
additional time T. This would give $e^T$ starting from y = 1, but here the starting value is
$e^t$. So $C = e^t$ multiplies $e^T$. At time t + T we have perfect agreement:

$e^t$ times $e^T$ (which is C times $e^T$) agrees with one big step $e^{t+T}$.


Negative Exponents 


Remember the example dy/dt = −y with solution $y = e^{-t}$. That exponent −t is negative.
The solution decays toward zero. The exponent rule $e^t e^T = e^{t+T}$ still holds for
negative exponents. In particular $e^t$ times $e^{-t}$ is $e^{t-t} = e^0 = 1$:

Negative exponents

$$e^{-t} = \frac{1}{e^t} \quad\text{and}\quad e^{-1} = \frac{1}{e} = 1 - 1 + \frac{1}{2} - \frac{1}{6} + \frac{1}{24} - \cdots$$

This number 1/e is about .36. The series always succeeds! The graph of $y = e^{-t}$
shows that $e^{-t}$ stays positive. It is very small for t > 32. Your computer might use 32
bit arithmetic and ignore numbers that are this small.

Why does $e^t$ grow so fast? The slope is y itself. So the slope increases when the
function increases. That steep slope makes y increase faster, and then the slope too.

Interest Rates and Difference Equations


There is another approach to $e^t$ and $e^{at}$, which is not based on an infinite series. (At least,
not at the start.) It connects to interest on bank accounts. For $e^t$ the rate is a = 1 = 100%.
For $e^{at}$ the differential equation is dy/dt = ay and the interest rate is a.


The different approach is to construct $e^t$ and $e^{at}$ as the limit of compound interest.

$$e^t = \lim_{N\to\infty}\left(1 + \frac{t}{N}\right)^N \qquad e^{at} = \lim_{N\to\infty}\left(1 + \frac{at}{N}\right)^N \tag{7}$$


The beauty of these formulas is that a bank does exactly what a computational scientist
does. They both start with the differential equation dy/dt = ay and the initial condition
y = 1 at t = 0. Banks and scientists don’t have computers that give exact solutions, when
y(t) changes continuously with time. Both take finite time steps $\Delta t$ instead of infinitesimal
steps dt. They reach time t in N steps of size $\Delta t = t/N$. Their approximations are
$Y_1, Y_2, \ldots, Y_N$ with $Y_0 = 1$. Compound interest produces a difference equation:

$$\frac{dy}{dt} = ay \quad\text{becomes}\quad \frac{Y_{n+1} - Y_n}{\Delta t} = aY_n \quad\text{and}\quad Y_{n+1} = (1 + a\,\Delta t)\,Y_n. \tag{8}$$


Each step multiplies the bank balance by $1 + a\,\Delta t$. The new balance is the old balance
$Y_n$ plus $a\,\Delta t\,Y_n$ (the interest on $Y_n$ in the time interval $\Delta t$). This is ordinary compound
interest that all banks offer, not continuous compounding as in dy/dt. The time step
can be $\Delta t$ = 1 year or 1 month. The balance at t = 2 years = 24 months is $Y_2$ or $Y_{24}$:

$$Y_2 = (1 + a)^2\,Y_0 \qquad Y_{24} = \left(1 + \frac{a}{12}\right)^{24} Y_0 \approx e^{2a}\,Y_0. \tag{9}$$

If the rate is a = 3 per cent per year = .03 per year, continuous compounding for 2 years
would produce the exponential factor $e^{.06} \approx 1.06184$. Monthly compounding produces
$(1.0025)^{24} \approx 1.06176$. We only lose a little, when the differential equation $y' = ay$ is
approximated by the difference equation in (8).
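
The numbers in this comparison are easy to reproduce. The sketch below is not from the text; the function name `compound` and its arguments are hypothetical. It applies the difference equation (8) step by step with $Y_0 = 1$.

```python
from math import exp

# Hypothetical helper (not from the text): apply Y_{n+1} = (1 + a*dt) * Y_n
# with dt = 1/steps_per_year, starting from Y_0 = 1.
def compound(a, years, steps_per_year):
    dt = 1.0 / steps_per_year
    Y = 1.0
    for _ in range(years * steps_per_year):
        Y = (1 + a * dt) * Y
    return Y

a = 0.03
yearly = compound(a, 2, 1)      # (1 + a)^2
monthly = compound(a, 2, 12)    # (1 + a/12)^24
continuous = exp(2 * a)         # e^{2a}, the limit of more and more steps
print(yearly, monthly, continuous)
```

Yearly compounding gives 1.0609, monthly gives about 1.06176, and the continuous limit is about 1.06184.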


The computational scientist is usually not willing to accept this loss of accuracy in Y.
Equation (8) with a forward difference $Y_{n+1} - Y_n$ is called Euler’s method.
Its accuracy is not high and not hard to improve. It is the natural choice for a bank,
because a backward difference costs them even more than continuous compounding:

$$\frac{Y_n - Y_{n-1}}{\Delta t} = aY_n \quad\text{or}\quad Y_n = \frac{1}{1 - a\,\Delta t}\,Y_{n-1}$$

$Y_n$ connects backward to the earlier $Y_{n-1}$. Now each step divides by $1 - a\,\Delta t$. After N steps
of size $\Delta t = t/N$, we are again close to $e^{at}$. But with backward differences and a > 0, we
overshoot the differential equation and the bank pays a little too much:


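$$\text{Backward difference: } \frac{1}{(1 - a\,\Delta t)^N} \text{ is above } e^{at} \qquad \text{Forward difference: } (1 + a\,\Delta t)^N \text{ is below } e^{at} \tag{10}$$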
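
A short numerical experiment shows the two difference methods closing in on $e^{at}$ from opposite sides. This is a sketch under stated assumptions, not the author's code: the function names are hypothetical, and it assumes a > 0 with $a\,\Delta t < 1$ so that the backward step is well defined.

```python
from math import exp

# Hypothetical helpers (not from the text). Both start from Y_0 = 1
# and take N steps of size dt = t/N.
def forward_euler(a, t, N):
    dt = t / N
    Y = 1.0
    for _ in range(N):
        Y = (1 + a * dt) * Y      # forward difference: multiply by 1 + a*dt
    return Y

def backward_euler(a, t, N):
    dt = t / N
    Y = 1.0
    for _ in range(N):
        Y = Y / (1 - a * dt)      # backward difference: divide by 1 - a*dt
    return Y

for N in (10, 100, 1000):
    print(N, forward_euler(1.0, 1.0, N), exp(1.0), backward_euler(1.0, 1.0, N))
```

For a = 1 and t = 1 the forward values stay below e and the backward values stay above it, and the gap shrinks as N grows.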
Complex Exponents


This isn’t the time and place to study complex numbers in detail. It will be the pages
about oscillations and $e^{i\omega t}$ that cannot go forward without the imaginary number i.
Here we are solving dy/dt = ay, and all I want to do is to choose a = i.

I can think of two ways to solve the complex equation dy/dt = iy. The fast way uses 
derivatives of sine and cosine, which we know well: 


Proposed solution          $y = \cos t + i \sin t$          (11)
Compare dy/dt              $dy/dt = -\sin t + i \cos t$
with the right side iy     $iy = i \cos t + i^2 \sin t$


To check dy/dt = iy, compare the last two lines. Use the rule $i^2 = -1$. (We had
to imagine this number, because no real number has $x^2 = -1$.) Then $-\sin t$ is the same
as $i^2 \sin t$. So $y = \cos t + i \sin t$ solves the equation dy/dt = iy. This solution starts
at y = 1 when t = 0, because cos 0 = 1 and sin 0 = 0.

The slower approach to dy/dt = iy uses the infinite series. Since a = i, the solution $e^{at}$
becomes $e^{it}$. Formally, the series for $y = e^{it}$ certainly solves dy/dt = iy:

Complex exponential

$$y = e^{it} = 1 + (it) + \frac{1}{2}(it)^2 + \frac{1}{6}(it)^3 + \cdots \tag{12}$$

The derivative of each term is i times the previous term. Since the series never stops, the
derivative dy/dt perfectly matches iy. And we are still starting at y = 1 when we substitute
t = 0. This infinite series $e^{it}$ equals the first solution $\cos t + i \sin t$.

Now use the rule $i^2 = -1$. For $(it)^2$ I will write $-t^2$. And $(it)^3$ equals $-it^3$. The
fourth power of i is $i^4 = i^2 i^2 = (-1)^2 = 1$. That sequence i, −1, −i, 1 repeats forever.

$$i = i^5 \qquad i^2 = i^6 = -1 \qquad i^3 = i^7 = -i \qquad i^4 = i^8 = 1$$


The infinite series (12) includes those four numbers multiplying powers of t: 


$$e^{it} = 1 + i\,\frac{t}{1!} - \frac{t^2}{2!} - i\,\frac{t^3}{3!} + \frac{t^4}{4!} + i\,\frac{t^5}{5!} - \frac{t^6}{6!} - i\,\frac{t^7}{7!} + \frac{t^8}{8!} + \cdots$$

This may be the first time a textbook has ever written out nine terms. You can see the
full repeat of i, −1, −i, 1. That last coefficient divides by 8! = 8·7·6·5·4·3·2·1
which is 40320.

The main point is that the solution $y = \cos t + i \sin t$ in equation (11) must be the same
as this series solution $e^{it}$. They both solve dy/dt = iy. They both start at y = 1 when t = 0.
The equality between them is one of the greatest formulas in mathematics.


Euler’s Formula is $e^{it} = \cos t + i \sin t$. (13)


Then $e^{i\pi} = \cos \pi + i \sin \pi = -1$. And $e^{2\pi i} = 1 + 2\pi i + \frac{1}{2}(2\pi i)^2 + \cdots$ must add to 1!

I cannot resist comparing $\cos t + i \sin t$ with the series for $e^{it}$. The real part of that
series must be cos t. The imaginary part (which multiplies i) must be sin t. The even
powers $1, t^2, t^4, \ldots$ give cosines. The odd powers $t, t^3, t^5, \ldots$ are multiplied by i:

Cosine is even

$$\cos t = 1 - \frac{t^2}{2} + \frac{t^4}{24} - \frac{t^6}{6!} + \cdots \tag{14}$$

Sine is odd

$$\sin t = t - \frac{t^3}{6} + \frac{t^5}{120} - \frac{t^7}{7!} + \cdots \tag{15}$$


These two pieces of the series for $e^{it}$ are famous functions on their own, and now we see
their Taylor series. They are beautifully connected by Euler’s Formula.
The derivative of the sine series is the cosine series:

$$\frac{d}{dt}\sin t = \frac{d}{dt}\left(t - \frac{1}{6}t^3 + \frac{1}{120}t^5 - \cdots\right) = 1 - \frac{1}{2}t^2 + \frac{1}{24}t^4 - \cdots = \text{cosine}$$

The derivative of the cosine series is minus the sine series:

$$\frac{d}{dt}\cos t = \frac{d}{dt}\left(1 - \frac{1}{2}t^2 + \frac{1}{24}t^4 - \cdots\right) = -t + \frac{1}{6}t^3 - \cdots = -\text{sine}$$

All this important information came from allowing the exponent in $e^{at}$ to be imaginary.
And $e^{it}$ times $e^{-it}$ is exactly $\cos^2 t + \sin^2 t = 1$.
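
Euler's Formula can be tested numerically by summing the exponential series with a complex argument. The sketch below is not from the text; `exp_series` is a hypothetical helper, and 60 terms is an arbitrary cutoff chosen to be more than enough for the small values of t used here.

```python
import math

# Hypothetical helper (not from the text): sum the first `terms` terms
# of 1 + z + z^2/2! + ... for a complex number z.
def exp_series(z, terms=60):
    total = 0j
    term = 1 + 0j                 # z^0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)       # turns z^n/n! into z^(n+1)/(n+1)!
    return total

t = 1.0
series_value = exp_series(1j * t)
euler_value = complex(math.cos(t), math.sin(t))
print(series_value, euler_value)
print(exp_series(1j * math.pi))                  # close to -1
print(exp_series(2j * math.pi))                  # close to 1
print(exp_series(1j * t) * exp_series(-1j * t))  # close to 1
```

The real part of the sum matches cos t and the imaginary part matches sin t, up to rounding error.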


Matrix Exponents 


One more thing, which you can safely ignore for now. The exponent in $e^{at}$ could become
a square matrix. Instead of solving dy/dt = ay by $e^{at}$, we can solve the matrix equation
dy/dt = Ay by the matrix $e^{At}$. Start with the identity matrix I instead of the number 1.

$e^{At}$ is a matrix

$$e^{At} = I + At + \frac{1}{2}(At)^2 + \frac{1}{6}(At)^3 + \cdots \tag{16}$$

The series has the usual form, with the matrix A instead of the number a. Here I stop,
because matrices come in Chapter 4: Systems of Equations. When the matrix A is three by
three, the equation dy/dt = Ay represents three ordinary differential equations. Still first
order linear, still constant coefficients, solved by $e^{At}$ in Section 6.4.

There is one big difference for matrices: $e^{At}e^{Bt} = e^{(A+B)t}$ is not true. For
numbers a and b this equation is correct. For matrices A and B something goes wrong
in equation (6). When you look closely, you see that b moved in front of $e^{at}$.
But $e^{At}B = Be^{At}$ is false for matrices.
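
A small example makes the failure concrete. The sketch below is not from the text: the two matrices are an illustrative choice, the helpers `matmul` and `matexp` are hypothetical, and the series (16) is simply cut off after 30 terms with t = 1.

```python
from math import cosh

# Hypothetical helpers (not from the text), written with plain lists.
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matexp(A, terms=30):
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]   # identity I
    term = [row[:] for row in total]                                # A^0 / 0!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]    # A^k / k!
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]
A_plus_B = [[0.0, 1.0], [1.0, 0.0]]

product = matmul(matexp(A), matexp(B))    # e^A times e^B
combined = matexp(A_plus_B)               # e^(A+B)
print(product)
print(combined)
print(matmul(A, B), matmul(B, A))         # AB and BA are different
```

Here $e^A e^B$ has 2 in its top left corner while $e^{A+B}$ has cosh 1, about 1.543. The two answers differ because AB is not equal to BA.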
REVIEW OF THE KEY IDEAS


1. In the series for $e^t$, each term $t^n/n!$ is the derivative of the next term.


2. Then the derivative of $e^t$ is $e^t$, and the exponent rule holds: $e^t e^T = e^{t+T}$.


3. Another approach to dy/dt = y is by finite differences $(Y_{n+1} - Y_n)/\Delta t = Y_n$.
$Y_{n+1} = Y_n + \Delta t\,Y_n$ is the same as compound interest. Then $Y_n$ is close to $e^{n\Delta t}\,Y_0$.


4. $y = e^{at}$ solves $y' = ay$, and a = i leads to $e^{it} = \cos t + i \sin t$ (Euler’s Formula).


5. $\cos t = 1 - t^2/2 + \cdots$ and $\sin t = t - t^3/6 + \cdots$ are the even and odd parts of $e^{it}$.


Problem Set 1.3 


1 Set t = 2 in the infinite series for $e^2$. The sum must be e times e, close to 7.39.
How many terms in the series to reach a sum of 7 ? How many terms to pass 7.3 ? 


2 Starting from y(0) = 1, find the solution to dy/dt = y at time t = 1. Starting from 
that y(1), solve dy/dt = —y to time t = 2. Draw a rough graph of y(t) from 
t = 0 to t = 2. What does this say about $e^{-1}$ times e?


3 Start with y(0) = $5000. If this grows by dy/dt = .02y until t = 5 and then jumps to 
a = .04 per year until t = 10, what is the account balance at t = 10? 


4 Change Problem 3 to start with $5000 growing at dy/dt = .04y for the first five years. 
Then drop to a = .02 until t = 10. What is now the balance at t = 10? 


Problems 5-8 are about $y = e^{at}$ and its infinite series.


5 Replace t by at in the exponential series to find $e^{at}$:

$$e^{at} = 1 + at + \frac{1}{2}(at)^2 + \frac{1}{6}(at)^3 + \cdots$$

Take the derivative of every term (keep five terms). Factor out a to show that
the derivative of $e^{at}$ equals $ae^{at}$. At what time T does $e^{at}$ reach 2?


6 Start from $y' = ay$. Take the derivative of that equation. Take the nth derivative.
Construct the Taylor series that matches all these derivatives at t = 0, starting from
$1 + at + \frac{1}{2}(at)^2$. Confirm that this series for y(t) is the series for $e^{at}$ in Problem 5.

7 At what times t do these events happen?
(a) $e^{at} = e$ (b) $e^{at} = e^2$ (c) $e^{a(t+2)} = e^{at}e^{2a}$.


8 If you multiply the series for $e^{at}$ in Problem 5 by itself you should get the series
for $e^{2at}$. Multiply the first 3 terms by the same 3 terms to see the first 3 terms in $e^{2at}$.


9 (recommended) Find y(t) if dy/dt = ay and y(T) = 1 (instead of y(0) = 1).


10 (a) If dy/dt = (ln 2)y, explain why y(1) = 2y(0).
(b) If dy/dt = −(ln 2)y, how is y(1) related to y(0)?


11 In a one-year investment of y(0) = $100, suppose the interest rate jumps from
6% to 10% after six months. Does the equivalent rate for a whole year equal 8%,
or more than 8%, or less than 8%?


12 If you invest y(0) = $100 at 4% interest compounded continuously, then
dy/dt = .04y. Why do you have more than $104 at the end of the year?


13 What linear differential equation dy/dt = a(t)y is satisfied by $y(t) = e^{\cos t}$?


14 If the interest rate is a = 0.1 per year in $y' = ay$, how many years does it take for
your investment to be multiplied by e? How many years to be multiplied by $e^2$?


15 Write the first four terms in the series for $y = e^{t^2}$. Check that dy/dt = 2ty.


16 Find the derivative of $Y(t) = \left(1 + \frac{t}{n}\right)^n$. If n is large, this dY/dt is close to Y!


17 Suppose the exponent in $y = e^{u(t)}$ is u(t) = integral of a(t). What equation
dy/dt = ____ y does this solve? If u(0) = 0 what is the starting value y(0)?


Challenge Problems


18 $e^{d/dx} = 1 + d/dx + \frac{1}{2}(d/dx)^2 + \cdots$ is a sum of higher and higher derivatives.
Applying this series to f(x) at x = 0 would give $f + f' + \frac{1}{2}f'' + \cdots$ at x = 0.
The Taylor series says: This is equal to f(x) at x = ____.


19 (Computer or calculator, 2.xx is close enough) Find the time t when $e^t = 10$.
The initial y(0) has increased by an order of magnitude, a factor of 10. The
exact statement of the answer is t = ____. At what time t does $e^t$ reach 100?


20 The most important curve in probability is the bell-shaped graph of $e^{-t^2/2}$.
With a calculator or computer find this function at t = −2, −1, 0, 1, 2. Sketch
the graph of $e^{-t^2/2}$ from t = −∞ to t = ∞. It never goes below zero.


21 Explain why $y_1 = e^{(a+b+c)t}$ is the same as $y_2 = e^{at}e^{bt}e^{ct}$. They both start at
y(0) = 1. They both solve what differential equation?


22 For $y' = y$ with a = 1, Euler’s first step chooses $Y_1 = (1 + \Delta t)Y_0$. Backward
Euler chooses $Y_1 = Y_0/(1 - \Delta t)$. Explain why $1 + \Delta t$ is smaller than the exact $e^{\Delta t}$
and $1/(1 - \Delta t)$ is larger than $e^{\Delta t}$. (Compare the series for $1/(1 - x)$ with $e^x$.)


Note Section 3.5 presents an accurate Runge-Kutta method that captures three
more terms of $e^{a\Delta t}$ than Euler. For dy/dt = ay here is the step to $Y_{n+1}$:

Runge-Kutta for $y' = ay$

$$Y_{n+1} = \left(1 + a\,\Delta t + \frac{a^2\,\Delta t^2}{2} + \frac{a^3\,\Delta t^3}{6} + \frac{a^4\,\Delta t^4}{24}\right) Y_n$$

1.4 Four Particular Solutions


The equation dy/dt = ay is solved by $y(t) = e^{at}y(0)$. All the input is in that starting
value y(0). The solution grows exponentially when a > 0 and it decays when a < 0.
This section allows new inputs q(t) after the starting time t = 0. That input q is a
“source” when we add to y(t), and a “sink” when we subtract. If y(t) is the balance
in a bank account at time t, then q(t) is the rate of new deposits and withdrawals.

The basic first order linear differential equation (1) is fundamental to this course.
We must and will solve this equation. Please pay attention to this section. In every way,
this Section 1.4 is important.

$$\frac{dy}{dt} = ay + q(t) \quad\text{starting from } y(0) \text{ at } t = 0. \tag{1}$$

Important I will separate the solution y(t) into two parts. One part comes from the
starting value y(0). The other part comes from the source term q(t). This separation is
a crucial step for all linear equations, and I take this chance to give names to the two parts.
The part $y_n = Ce^{at}$ is what we already know. The part $y_p$ from the source q(t) is new.


1 Homogeneous solution or null solution $y_n(t)$ with no source: q = 0


This part $y_n(t) = Ce^{at}$ solves the equation dy/dt = ay. The source term q is zero
(null). We are really solving $y' - ay = 0$, an equation with zero on the right hand side.
That equation is homogeneous: we can multiply a solution by any constant to get
another solution cy(t). This book will choose the simpler word null and the subscript n,
because this connects differential equations to linear algebra.


2 Particular solution $y_p(t)$ with source q(t)


This part $y_p(t)$ comes from the source term q(t). The previous section had no source
and therefore no reason to mention $y_p(t)$. Now our whole task is to find a
particular solution $y_p(t)$, because the null solutions $y_n(t) = Ce^{at}$ are already set.

3 The complete solution is $y(t) = y_n(t) + y_p(t)$


For linear equations (and only for linear equations) adding the two parts gives the complete
solution $y = y_n + y_p$. This is also called the “general solution.”


Null                       $y_n' - ay_n = 0$        $y_n$ can start from y(0)
Particular                 $y_p' - ay_p = q(t)$     $y_p$ can start from y = 0
Complete $y = y_n + y_p$   $y' - ay = q(t)$         y must start from y(0)


A nonlinear equation could include a quadratic term $y^2$. In that case adding $y_n^2$ to $y_p^2$
would not give $(y_n + y_p)^2$. The null equation $y' - y^2 = 0$ would not be homogeneous,
and we can’t multiply y by a constant C. This will happen for the “logistic equation” in
Section 1.7. You will see that y(0) enters the solution y(t) in a more complicated way.

The back cover of this book shows one particular solution $y_p$ combining with all null
solutions $y_n$. This important picture is repeated for matrix equations and linear algebra.

Particular Solutions and the Complete Solution


We can draw the complete solution to u + v = 6. These points (u, v) fill a straight line. We 
can also draw all the null solutions to u + v = 0. They fill a parallel straight line, going 
through the center point (0,0). Figure 1.2 shows how the null solutions combine with one 
particular solution (3, 3) to give the line of complete solutions. 


[Figure 1.2 shows the (u, v) plane with the complete line u + v = 6 and the parallel null line u + v = 0. Marked points: null solutions $y_n = (C, -C)$, one particular solution $y_p = (3, 3)$, another particular solution $y_p = (6, 0)$.]

Figure 1.2: By adding all the null solutions to one particular solution, you get every solution
(the complete line). You can start from any particular $y_p$ that solves u + v = 6.


Starting from $y_p = (3, 3)$, the complete solution has u = 3 + C and v = 3 − C.
This includes a null solution C + (−C) = 0, plus the particular solution 3 + 3 = 6.


Null          $u_n + v_n = 0$      C + (−C) = 0
Particular    $u_p + v_p = 6$      3 + 3 = 6
Complete      $u + v = 6$          (3 + C) + (3 − C) = 6


The null solution (C, −C) allows any constant C (like y(0)). The particular solution could
have any numbers $u_p$ and $v_p$ that add to 6. We made a special choice $u_p = 3$ and $v_p = 3$.
In the equation $y' - ay = q$ we will often make the special choice $y_p(0) = 0$.

There are many particular solutions! You could say that we chose a very particular
solution. In the differential equation we chose to start from $y_p(0) = 0$. For the equation
u + v = 6 we chose u = 3 and v = 3. We could equally well choose u = 6 and v = 0. This
particular solution is different, but we get the same complete solution line:


$y_{\text{complete}} = (6 + c,\, 0 - c)$ is the same solution line as $y_{\text{complete}} = (3 + C,\, 3 - C)$.


If c is 5, then C is 8. From all c’s and all C’s, you get the same line.


I want to repeat this pattern of null solution plus particular solution by showing 
how it looks for an ordinary matrix equation Av = b (Chapter 4 explains matrices) : 


Null solution $Av_n = 0$     Particular solution $Av_p = b$     Complete solution $v = v_n + v_p$


Always the key is linearity: Av equals $Av_n + Av_p$. Therefore Av = 0 + b = b.
Often the only solution to $Av_n = 0$ is $v_n = 0$. Then a particular solution $v_p$ is also
the complete solution. This will happen when A is an “invertible matrix.”

Inputs q(t) and Responses y(t)


For any input source q(t), equation (4) will solve dy/dt = ay + q(t). But when
mathematics is applied to science and engineering and our society, problems don’t
involve “any q(t).” Certain functions q(t) are the most important. Those functions are
constantly met in applied mathematics. Here is a short list of special inputs:


1. Constant source          q(t) = q
2. Step function at T       q(t) = H(t − T)
3. Delta function at T      q(t) = δ(t − T)
4. Exponential              q(t) = $e^{ct}$


This section will solve dy/dt = ay + q(t) for the four functions on that short list.
The next section adds one more source q(t). It is a combination of sine and cosine.
Or q(t) can be a complex exponential (which has one term and is usually easier):


5. Sinusoid                 q(t) = A cos ωt + B sin ωt or $Re^{i\omega t}$


Solving Linear Equations by an Integrating Factor 


The equation y’ = ay + q is so important that I will solve it in different ways. The first way 
uses an integrating factor M(t). Put both y terms on the left. Keep q(t) on the right. 


Problem   Solve $y' - ay = q(t)$ starting from any y(0)
Method    Multiply both sides by the integrating factor $M(t) = e^{-at}$.


We chose that factor $e^{-at}$ so that M times $y' - ay$ is exactly the derivative of My:

Perfect derivative

$$e^{-at}(y' - ay) \text{ agrees with } \frac{d}{dt}\left(e^{-at}y\right) = \frac{d}{dt}(My). \tag{2}$$

When both sides of $y' - ay = q$ are multiplied by $M = e^{-at}$, our equation is immediately
ready to be integrated. The right side is Mq, the left side is the derivative of My.

$$\text{The integral of } \frac{d}{dt}(My) = Mq \text{ is } M(t)\,y(t) - M(0)\,y(0) = \int_0^t M(s)\,q(s)\,ds \tag{3}$$

At t = 0 we know that $M(0) = e^0 = 1$. Multiply both sides of equation (3) by $e^{at}$
(which is 1/M) to see $y(t) = y_n + y_p$. This solution comes many times in the book!
To give meaning to formula (4), I will apply it to the most important inputs q(t).

The key formula: solution to $y' = ay + q(t)$

$$y(t) = e^{at}\,y(0) + e^{at}\int_0^t e^{-as}\,q(s)\,ds. \tag{4}$$

Constant Source q(t) = q


When q(t) is a constant, the integration for the particular solution in equation (4) is easy. 


$$\int_0^t e^{-as}\,q\,ds = \left[\frac{q\,e^{-as}}{-a}\right]_{s=0}^{s=t} = \frac{q}{a}\left(1 - e^{-at}\right).$$

Multiply by $e^{at}$ to find $y_p(t)$. An important solution to an important equation.

Solution for constant source q

$$y(t) = e^{at}\,y(0) + \frac{q}{a}\left(e^{at} - 1\right) \tag{5}$$

Example 1 has a positive growth rate a > 0. The solution will increase when q > 0.
Example 2 will have a negative rate a < 0. In that case y(t) approaches a steady state.


Example 1 Solve dy/dt − 5y = 3 starting from y(0) = 2. Here a = 5 and q = 3.
This fits perfectly with $y' - ay = q$. Equation (5) gives the solution y(t):
Solution $y(t) = y_n + y_p = 2e^{5t} + \frac{3}{5}(e^{5t} - 1)$. Set t = 0 to check that y(0) = 2.
Looking at that solution, I have to admit that $y' - 5y = 3$ is not so obvious. This becomes
much clearer when the two parts (null + particular) are separated:

$y_n(t) = 2e^{5t}$ certainly has $y_n' - 5y_n = 0$ with $y_n(0) = 2$

$y_p(t) = \frac{3}{5}(e^{5t} - 1)$ has $y_p' = 3e^{5t}$. This agrees with $5y_p + 3$.
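
Example 1 can also be checked numerically. The sketch below is not from the text; the function names are hypothetical, and the integral in formula (4) is approximated by the midpoint rule, which is my choice of quadrature. It compares formula (5) with formula (4) and tests that $y' - 5y$ is close to 3.

```python
from math import exp

# Hypothetical helpers (not from the text).
def constant_source_solution(t, a, q, y0):
    # Formula (5): y(t) = e^{at} y(0) + (q/a)(e^{at} - 1)
    return exp(a * t) * y0 + (q / a) * (exp(a * t) - 1)

def integrating_factor_solution(t, a, q_func, y0, n=20000):
    # Formula (4): the integral of e^{-as} q(s) from 0 to t, by the midpoint rule
    ds = t / n
    integral = sum(exp(-a * (k + 0.5) * ds) * q_func((k + 0.5) * ds)
                   for k in range(n)) * ds
    return exp(a * t) * y0 + exp(a * t) * integral

a, q, y0, t = 5.0, 3.0, 2.0, 0.5
exact = constant_source_solution(t, a, q, y0)
numeric = integrating_factor_solution(t, a, lambda s: q, y0)

# Centered difference for y'(t), then the left side y' - a*y (should be near q).
h = 1e-6
slope = (constant_source_solution(t + h, a, q, y0)
         - constant_source_solution(t - h, a, q, y0)) / (2 * h)
residual = slope - a * exact
print(exact, numeric, residual)
```

The two formulas agree to many digits, and the residual is close to the source q = 3.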


Example 2 Solve dy/dt = 3 − 6y starting from y(0) = 2.


Formula (5) still gives the answer, but this y(t) is decreasing because a = −6 is negative:

$$y(t) = 2e^{-6t} + \frac{3}{-6}\left(e^{-6t} - 1\right) = \frac{3}{2}e^{-6t} + \frac{1}{2}.$$

When t = 0, that solution starts at y(0) = 2. The solution decreases because of $e^{-6t}$.
As $t \to \infty$ the solution approaches $y_\infty = \frac{1}{2}$. This value −q/a at t = ∞ is a steady state.

At $y = -\frac{q}{a} = \frac{1}{2}$ the equation $\frac{dy}{dt} = 3 - 6y$ becomes $\frac{dy}{dt} = 0$. Nothing moves.

Please notice that the steady state is $y_\infty = \frac{1}{2}$ for every initial value y(0). That is because the
null solution $y_n = y(0)e^{-6t}$ approaches zero. It is the particular solution that balances the
source term q = 3 with the decay term ay = −6y to approach $y_\infty = -q/a = 3/6$.


Question If y(0) = ½, what is y(t)? Answer y(t) = ½ at all times. 6y balances 3.

[Figure 1.3 shows solution curves starting from y(0) = 3/4, y(0) = 1/2, and y(0) = 0. Every starting value leads to the steady state.]

Figure 1.3: When a is negative, $e^{at}$ approaches zero and y(t) approaches $y_\infty = -q/a$.
Here is an important way to rewrite that basic equation $y' = ay + q$ when a < 0.


The right hand side is the same as $a\left(y + \frac{q}{a}\right)$. But $y + \frac{q}{a}$ is exactly the distance $y - y_\infty$.
Rewrite $y' = ay + q$ as an easy equation $Y' = aY$ by introducing $Y = y - y_\infty$.


New unknown $Y = y - y_\infty$     New equation $Y' = aY$     New start $Y(0) = y(0) - y_\infty$


The solution to $Y' = aY$ is certainly $Y(t) = Y(0)e^{at}$. This approaches $Y_\infty = 0$ when
a < 0. The original $y = Y + y_\infty$ still approaches $y_\infty$ which is −q/a: see Figure 1.3.

$$(y - y_\infty)' = a\,(y - y_\infty) \quad\text{has solution}\quad y(t) - y_\infty = e^{at}\left(y(0) - y_\infty\right) \tag{6}$$
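
Equation (6) is easy to tabulate. The sketch below is not from the text; the function name is hypothetical. It uses a = −6 and q = 3 from Example 2 and the starting values shown in Figure 1.3, plus y(0) = 2.

```python
from math import exp

# Hypothetical helper (not from the text): equation (6),
# y(t) - y_inf = e^{at} (y(0) - y_inf) with y_inf = -q/a.
def approach_steady_state(t, a, q, y0):
    y_inf = -q / a
    return y_inf + exp(a * t) * (y0 - y_inf)

a, q = -6.0, 3.0
for y0 in (0.0, 0.5, 0.75, 2.0):
    values = [round(approach_steady_state(t, a, q, y0), 4)
              for t in (0.0, 0.25, 0.5, 1.0, 2.0)]
    print(y0, values)
```

Every row of the printout heads to 0.5, and the row that starts at 0.5 never moves.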


Section 1.6 will present physical examples with a < 0: Newton’s Law of Cooling, 
the level of messenger RNA, the decaying concentration of a drug in the bloodstream. 
Step Function 


The unit step function or “Heaviside step function” H(t) jumps from 0 to 1 at t = 0.
Figure 1.4 shows its graph. The effect of H(t) is like turning on a switch.

The second graph shows a shifted step function H(t − T) which jumps from 0 to 1
at time T. This is the moment when t − T = 0, so H jumps at that moment T.

[Figure 1.4 has two panels: H(t) with a jump from 0 to 1 at t = 0, and H(t − T) with the jump at time T.]

Figure 1.4: The unit step function is H(t). Its shift H(t − T) jumps to 1 at t = T.

When the step comes at t = 0, the solution to $y' - ay = H(t)$ is the step response.
That step response is easy to find because this equation is simply $y' - ay = 1$.
The starting value is y(0) = 0. Put q = 1 into formula (5):

Step response

$$y(t) = \frac{1}{a}\left(e^{at} - 1\right) \tag{7}$$


The interesting case is a < 0. The solution starts at y(0) = 0. It grows to $y(\infty) = -1/a$.
The system rises to that steady state after the switch is turned on. The graph of y(t) is
the bottom curve in Figure 1.3, except that $y_\infty$ is 1/6 (with a = −6 as in Example 2)
because the step function has q = 1.


The step response is the output y(t) when the step function is the input. We are depositing
at a constant rate q = 1. But when a < 0, we are losing ay in real value because of inflation.
Then growth stops at y = −1/a, where the deposits just balance the loss.

Now turn on the switch at time T instead of time 0. The step function H(t − T) is
piecewise constant with two pieces: zero and one. If I multiply by any constant q, the source
q H(t − T) jumps from 0 to strength q at time T.

The left side of our differential equation is still $y' - ay$, no change. The integrating
factor $M = e^{-at}$ still makes that into a perfect derivative: $M(y' - ay)$ equals $(My)'$.
The only change is on the right side, where the constant source doesn’t start acting
until the jump time T. At that time, the step function source H(t − T) is turned on:


$$\left(e^{-at}y\right)' = e^{-at}H(t - T) \quad\text{now gives}\quad e^{-at}y(t) - e^{0}y(0) = \int_T^t e^{-as}\,ds. \tag{8}$$

The only change for t > T is to start that integral at the turn-on time T:

$$\int_T^t e^{-as}\,ds = \left[\frac{e^{-as}}{-a}\right]_{s=T}^{s=t} = \frac{1}{a}\left(e^{-aT} - e^{-at}\right). \tag{9}$$

Multiply by $e^{at}$ to get the particular solution $y_p(t)$ beyond time T, and add $y_n = e^{at}y(0)$.

Solution with unit step

$$y(t) = e^{at}y(0) + \frac{1}{a}\left(e^{a(t-T)} - 1\right) \quad\text{for } t > T. \tag{10}$$

As always, y(0) grows or decays with $e^{at}$ in the null solution $y_n$. The step response is the
particular solution, as soon as the input begins. But nothing enters until time T.
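
Formula (10) can be compared with a direct numerical integration of formula (4). The sketch below is not from the text: the helper names and the values a = −1, T = 1, y(0) = 2 are illustrative assumptions, and the integral uses the midpoint rule with the jump time placed on a grid boundary. It also tests an input that turns on at 0 and off at T, starting from y(0) = 0.

```python
from math import exp

# Hypothetical helpers (not from the text).
def H(t):
    return 1.0 if t >= 0 else 0.0     # unit step function

def response(t, a, source, y0, n=4000):
    # Formula (4) written as e^{at} y(0) + integral of e^{a(t-s)} source(s) ds,
    # with the integral approximated by the midpoint rule.
    ds = t / n
    total = sum(exp(a * (t - (k + 0.5) * ds)) * source((k + 0.5) * ds)
                for k in range(n)) * ds
    return exp(a * t) * y0 + total

def delayed_step_solution(t, a, T, y0):
    # Formula (10) for t > T; before T only the null solution is present.
    y = exp(a * t) * y0
    if t > T:
        y += (exp(a * (t - T)) - 1) / a
    return y

a, T, y0, t = -1.0, 1.0, 2.0, 2.0
numeric = response(t, a, lambda s: H(s - T), y0)
formula = delayed_step_solution(t, a, T, y0)

# Input on at time 0 and off at time T, starting from y(0) = 0.
pulse_numeric = response(t, a, lambda s: H(s) - H(s - T), 0.0)
pulse_formula = (exp(a * t) - exp(a * (t - T))) / a
print(numeric, formula, pulse_numeric, pulse_formula)
```

Both pairs of numbers agree closely, which supports the step formula and the difference of two step responses.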


Example 3 Suppose the input turns on at time t = 0 and turns off at t = T. Find y(t).


Solution The input is H(t) − H(t − T). The output is $y(t) = \frac{1}{a}\left(e^{at} - e^{a(t-T)}\right)$, t > T.

Delta Function


Now we meet a remarkable function δ(t). This “delta function” is everywhere zero, except
at the instant t = 0. In that one moment it gives a unit input. Instead of a continuing
source spread out over time, δ(t) is a point source completely concentrated at t = 0.

For a point source shifted to δ(t − T), everything enters exactly at time T.
There is no source before that time or after that time. The delta function is zero except
at one point. This “impulse” is by no means an ordinary function.

Here is one way to think about δ(t). The delta function is the derivative of the unit step
function H(t). But H is constant and dH/dt is zero except at t = 0. Take
the integral of δ(t) = dH/dt from any negative number N to any positive number P.

Integral of δ(t) is 1

$$\int_N^P \delta(t)\,dt = \int_N^P \frac{dH}{dt}\,dt = H(P) - H(N) = 1 - 0. \tag{11}$$

“The area under the graph of δ(t) is 1. All that area is above the single point t = 0.”
Those words are in quotes because area at a point is impossible for ordinary functions. δ(t)
may seem new and strange (it is useful!). Look at dR/dt = H and dH/dt = δ.


[Figure: the ramp R(t) with slope 0 and then slope 1, the step H(t) = dR/dt with jump 1, and the delta function δ(t) = dH/dt with area 1.]

Slope of the ramp jumps to 1. Slope of the step function is the delta function.
The value of δ(0) is infinite. But that one word does not give full information.
The real way to understand delta functions is by their integrals.

$$\int_{-\infty}^{\infty} \delta(t)\,dt = 1 \qquad \int_{-\infty}^{\infty} \delta(t)\,F(t)\,dt = F(0) \qquad \int_{-\infty}^{\infty} \delta(t - T)\,F(t)\,dt = F(T) \tag{12}$$

Please visualize a tall thin box function, equal to 1/h between t = 0 and t = h.
Now imagine h going to zero. The width h becomes zero and the height 1/h becomes
infinite. The area stays at 1. All integrals of δ(t) F(t) are concentrated at t = 0: the “spike”.
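
The box picture can be tried on a computer. The sketch below is not from the text; `box_integral` is a hypothetical helper that integrates the box of height 1/h against a smooth function F by the midpoint rule. As h shrinks, the result should approach F(0).

```python
from math import cos, exp

# Hypothetical helper (not from the text): integral of (1/h) * F(t)
# over 0 <= t <= h, approximated by the midpoint rule with n pieces.
def box_integral(F, h, n=1000):
    dt = h / n
    return sum(F((k + 0.5) * dt) for k in range(n)) * dt / h

# Both cos(0) and exp(0) equal 1, so both columns should approach 1.
for h in (1.0, 0.1, 0.01, 0.001):
    print(h, box_integral(cos, h), box_integral(exp, h))
```

For wide boxes the integral is an average of F over the box. For thin boxes it picks out the single value F(0) = 1.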

Here is a quick way to solve $y' - ay = \delta(t)$, and then we will do it more slowly. We
know that the derivative of a step function H(t) is the delta function δ(t). So the derivative
of the step response must be the impulse response:

$$\frac{d}{dt}(\text{step}) = \text{delta} \qquad \frac{d}{dt}(\text{step response}) = \frac{d}{dt}\left(\frac{e^{at} - 1}{a}\right) = e^{at} = \text{impulse response} \tag{13}$$

The Impulse Response Solves $y' - ay = \delta(t)$


Start your bank account with one deposit. Start your heart with a sudden shock. 
Hit a golf ball. Fire a bullet. Many motions start with an “impulse” and then the source 
term is a delta function δ(t).

The impulse response y(t) jumps immediately to y(0) = 1. You can see that by
integrating every term in dy/dt − ay = δ(t). Integrating δ(t) from t = −h to h gives 1.
Integrating dy/dt gives y(h) − y(−h), which is y(h) because y is zero before the impulse.
The integral of ay becomes zero as h → 0. That limit step when h → 0 leaves y(0) = 1.

After the jump to y(0) = 1, the impulse δ(t) is immediately zero. So we just have the
ordinary null solution to $y' = ay$ starting from y(0) = 1:


Impulse response     $y' - ay = \delta(t)$     $y(t) = e^{at}$     (14)


Notice the different responses to an impulse and a step function. The impulse deposits 
everything at t = 0. The step function goes on depositing forever. If a < 0 and inflation 
reduces our wealth, the impulse response dies out to $y_\infty = 0$. The step response increases
from 0 to $y_\infty = -1/a$, where the deposits balance the loss from inflation.

I want to emphasize: $e^{at}$ is the growth or decay factor G(t) for all inputs. When the
input is y(0), the output at time t is $e^{at}y(0)$. When the input is q(s) at time s,
the output later at t is $e^{a(t-s)}q(s)$. The growth is only over the remaining time t − s.
Our main formula (4) is adding up all the outputs that come from all the inputs.
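
That adding up of outputs can be imitated with a list of deposits. The sketch below is not from the text: the function `balance` and the numbers are illustrative assumptions. Each deposit made at time s grows by the factor $e^{a(t-s)}$, and many small deposits of size q(s) ds imitate a continuous source.

```python
from math import exp

# Hypothetical helper (not from the text): the starting value grows by e^{at},
# and each deposit made at time s <= t grows by e^{a(t-s)}.
def balance(t, a, y0, deposits):
    total = exp(a * t) * y0
    for s, amount in deposits:
        if s <= t:
            total += amount * exp(a * (t - s))
    return total

a, t, n = -0.5, 2.0, 20000
ds = t / n
# Small deposits q(s)*ds with q = 1, placed at the midpoints of n short intervals.
small_deposits = [((k + 0.5) * ds, 1.0 * ds) for k in range(n)]
approx_step_response = balance(t, a, 0.0, small_deposits)
exact_step_response = (exp(a * t) - 1) / a
print(approx_step_response, exact_step_response)

# One sudden deposit of 50 at time 1, on top of y(0) = 100 growing at a = 0.1.
print(balance(3.0, 0.1, 100.0, [(1.0, 50.0)]))
```

The sum of many small impulse responses lands on the step response $(e^{at} - 1)/a$, which is the integral in formula (4) for q = 1.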


Delayed Delta Function 


The source q(t) = δ(t − T) turns on at time T. Then immediately it turns off. In that
one instant of time, the value of y jumps by 1. “We deposited $1 at that moment.”
The integral of dy/dt = δ(t − T) is 1. This is the change in y, before T to after T.


Coming up to time T, the solution is $y(t) = e^{at}y(0)$. At time T we add 1. After
time T, that input has the shorter period t − T in which to grow. Multiply 1 by $e^{a(t-T)}$:

Solution for q = δ(t − T)

$$y(t) = y_n(t) + y_p(t) = e^{at}y(0) + e^{a(t-T)} \quad\text{after time } T. \tag{15}$$

The solution y jumps by $e^{a(T-T)} = e^0 = 1$, when that second term appears at t = T.


Example 4 Solve the equation $y' - 5y = 3\,\delta(t - 4)$ starting from y(0) = 2.

The null solution to $y' - 5y = 0$ starting at y(0) = 2 is $y_n(t) = 2e^{5t}$. This we know.
The particular solution is $y_p(t) = 0$ up to t = 4. At that moment $y_p$ jumps by 3, from $3\delta$.
Its growth factor is $e^{5(t-4)}$. Then $y_p(t) = 3e^{5(t-4)}$ after t = 4.

Complete solution with jump of 3.

$$y_n + y_p = 2e^{5t} + 3e^{5(t-4)}H(t - 4) \tag{16}$$

The step function H(t − 4) combines $y_p = 0$ before the jump and $y_p$ after the jump into
one formula. At t = 4 the solution jumps by 3. Then this 3 grows to $3e^{5(t-4)}$.

Remark 1 This solution makes me realize that the initial value y(0) is like having a delta
function at time t = 0. The solution “jumps” to y(0). I don’t know if you agree with that.


Remark 2 q(t) = −δ(t − T) would be negative (a sink instead of a source). A bank
account could be earning interest at the rate a, and suddenly you withdraw 1 at time T. The
balance y(T) had reached e^{aT} y(0), and it drops by 1. From time T onwards, the growth
factor e^{a(t−T)} multiplies the new balance, and y(t) = e^{at} y(0) − e^{a(t−T)}.

Remark 3 (a little mysterious) We could think of an ordinary continuous input q(t) as
a lot of delta functions: a delta function of strength q(T) at every time T. Instead of
“a lot” I need to say “an integral”. Every continuous function q(t) is an integral of delta
functions q(T) δ(t − T) at all T. The integral picks out q(t) at the spike point.

Any q(t) = combination of delta functions = ∫ q(T) δ(t − T) dT.    (17)

Example 5 (q = 1) The integral of all impulses for T > 0 is the step function H(t).

Then the integral of all impulse responses is the step response. The integral of e^{at} from
0 to t is (e^{at} − 1)/a. Derivative of step response = impulse response as in (13).


Exponential Input e^{ct}

The source q(t) = e^{ct} starts at time zero and continues forever. The particular solution
y_p(t) is easy to find, because y_p is a multiple Ye^{ct} of this same exponential e^{ct}.
That is the beauty of exponentials. These are the most important functions and the best to
work with. They allow growth or decay or oscillation from c > 0 and c < 0 and c = iω.

Substitute y_p = Ye^{ct} into y′ − ay = e^{ct}    cYe^{ct} − aYe^{ct} = e^{ct}

When we cancel e^{ct} this leaves a simple formula for the number Y in Ye^{ct}:

cY − aY = 1  gives  Y = 1/(c − a)  and  y_p(t) = e^{ct}/(c − a).    (18)

Example 6 Solve y′ − 5y = 3e^{4t} starting from y(0) = 2. Now Y = 3/(c − a).

The null solution still involves e^{5t}. The particular solution is Y times e^{4t}!

y_p(t) = Ye^{4t}    y_p′ − 5y_p = (4Y − 5Y)e^{4t} = 3e^{4t}. Then Y = −3.
This particular solution −3e^{4t} starts at −3. Since y(0) = 2, the other part starts at +5.

Complete solution y(t) = 5e^{5t} − 3e^{4t}.
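
A quick check of Example 6, written as a sketch: the residual y′ − 5y − 3e^{4t} is estimated with a central difference. The function names and the step size h are assumptions chosen for illustration.

```python
import math

def y(t):
    # Complete solution of Example 6: null part 5e^{5t} plus particular part -3e^{4t}.
    return 5 * math.exp(5 * t) - 3 * math.exp(4 * t)

def residual(t, h=1e-6):
    # Central-difference estimate of y'(t); the residual y' - 5y - 3e^{4t} should be near zero.
    dydt = (y(t + h) - y(t - h)) / (2 * h)
    return dydt - 5 * y(t) - 3 * math.exp(4 * t)
```

Evaluating y(0) recovers the starting value 2, and residual(t) stays at the level of rounding error for moderate t.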
The null solution grows at rate a = 5. One particular solution grows at rate c = 4.
The equation y′ − ay = e^{ct} is solved for c ≠ a but two final comments are needed.


1. This particular solution y_p(t) = e^{ct}/(c − a) is not the “very particular” solution
   that starts from y_p(0) = 0. It is still perfectly good, except it starts at 1/(c − a).
   So the complete solution starting at y(0) has to include the usual y(0)e^{at} and
   also a term to cancel 1/(c − a) at time zero:

   y′ − ay = e^{ct}    y_complete = y(0) e^{at} − e^{at}/(c − a) + e^{ct}/(c − a)    (19)

   There you see a null solution y_n (two terms) and our particular y_p (the last term).
   Or the last two terms together are the very particular solution (e^{ct} − e^{at})/(c − a).

2. For c = a we are in serious trouble. The formulas fail because we can’t divide by
   c − a = 0. This problem y′ − ay = e^{at} is a type of resonance, when the exponent
   c in the source happens to equal the exponent a in the natural growth from y′ = ay.
   The integral in our main formula (4) becomes ∫ e^{a(t−s)} e^{as} ds = e^{at} ∫ 1 ds = te^{at}.

   Resonance c = a    y′ − ay = e^{at}    y = y(0)e^{at} + te^{at}    (20)

That extra growth factor t is because y_p resonates with y_n. They both have e^{at}.
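
The resonant solution can also be seen as a limit: as c approaches a, the very particular solution (e^{ct} − e^{at})/(c − a) approaches te^{at}. A minimal sketch of that limit, with hypothetical function names and sample values a = 2, t = 1:

```python
import math

def very_particular(a, c, t):
    # (e^{ct} - e^{at})/(c - a): valid only when c != a.
    return (math.exp(c * t) - math.exp(a * t)) / (c - a)

def resonant(a, t):
    # Resonant particular solution t e^{at}, used when c = a.
    return t * math.exp(a * t)

a, t = 2.0, 1.0
# Distance between the two formulas as c moves toward a.
gaps = [abs(very_particular(a, a + d, t) - resonant(a, t)) for d in (1e-1, 1e-2, 1e-3)]
```

The gaps shrink roughly in proportion to c − a, which is the near-resonance behavior asked about in the problem set.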


REVIEW OF THE KEY IDEAS

1. Complete solution to a linear equation = null solution(s) + particular solution.

2. The integrating factor e^{−at} multiplies y′ − ay = q(t) to give (e^{−at} y)′ = e^{−at} q(t).
   Integrate and multiply by e^{at}: y(t) = y_n + y_p = e^{at} y(0) + e^{at} ∫ e^{−as} q(s) ds.

3. For y′ − ay = q = constant, the particular solution with y_p(0) = 0 is q(e^{at} − 1)/a.

4. q(t) = H(t): the response to a unit step function is y_p = (e^{at} − 1)/a.

5. q(t) = δ(t): the impulse response to a unit delta function is y_p = e^{at}.

6. q(t) = e^{ct} gives y_p = (e^{ct} − e^{at})/(c − a). In case c = a, change to y_p = te^{at}.
Problem Set 1.4

1  All solutions to dy/dt = −y + 2 approach the steady state where dy/dt is zero and
   y = y_∞ = __. That constant y = y_∞ is a particular solution y_p.
   Which y_n = Ce^{−t} combines with this steady state y_p to start from y(0) = 4?
   This question chose y_p + y_n to be y_∞ + transient (decaying to zero).

2  For the same equation dy/dt = −y + 2, choose the null solution y_n that starts from
   y(0) = 4. Find the particular solution y_p that starts from y(0) = 0.
   This splitting chooses the two parts e^{at} y(0) + integral of e^{a(t−s)} q in equation (4).

3  The equation dy/dt = −2y + 8 has two natural splittings y_S + y_T = y_N + y_P:
   1. Steady (y_S = y_∞) + Transient (y_T → 0). What are those parts if y(0) = 6?
   2. (y_N′ = −2y_N from y_N(0) = 6) + (y_P′ = −2y_P + 8 starting from y_P(0) = 0).

4  All null solutions to u − 2v = 0 have the form (u, v) = (c, __).
   One particular solution to u − 2v = 3 has the form (u, v) = (7, __).
   Every solution to u − 2v = 3 has the form (7, __) + c(1, __).
   But also every solution has the form (3, __) + C(__, __) for C = c + 4.

5  The equation dy/dt = 5 with y(0) = 2 is solved by y = __. A natural splitting
   y_n(t) = __ and y_p(t) = __ comes from y_n = e^{at} y(0) and y_p = ∫ e^{a(t−s)} 5 ds.
   This small example has a = 0 (so ay is absent) and c = 0 (the source is q = 5e^{0t}).
   When a = c we have “resonance.” A factor t will appear in the solution y.

Starting with Problem 6, choose the very particular y_p that starts from y_p(0) = 0.

6  For these equations starting at y(0) = 1, find y_n(t) and y_p(t) and y(t) = y_n + y_p.
   (a) y′ − 9y = 90    (b) y′ + 9y = 90

7  Find a linear differential equation that produces y_n(t) = e^{2t} and y_p(t) = 5(e^{8t} − 1).

8  Find a resonant equation (a = c) that produces y_n(t) = e^{2t} and y_p(t) = 3te^{2t}.

9  y′ = 3y + e^{3t} has y_n = e^{3t} y(0). Find the resonant y_p with y_p(0) = 0.
Problems 10-13 are about y′ − ay = constant source q.

10  Solve these linear equations in the form y = y_n + y_p with y_n = y(0)e^{at}.
    (a) y′ − 4y = −8    (b) y′ + 4y = 8    Which one has a steady state?

11  Find a formula for y(t) with y(0) = 1 and draw its graph. What is y_∞?
    (a) y′ + 2y = 6    (b) y′ + 2y = −6

12  Write the equations in Problem 11 as Y′ = −2Y with Y = y − y_∞. What is Y(0)?

13  If a drip feeds q = 0.3 grams per minute into your arm, and your body eliminates the
    drug at the rate 6y grams per minute, what is the steady state concentration y_∞? Then
    in = out and y_∞ is constant. Write a differential equation for Y = y − y_∞.

Problems 14-18 are about y′ − ay = step function H(t − T):

14  Why is y_∞ the same for y′ + y = H(t − 2) and y′ + y = H(t − 10)?

15  Draw the ramp function that solves y′ = H(t − T) with y(0) = 2.

16  Find y_n(t) and y_p(t) as in equation (10), with step function inputs starting at T = 4.
    (a) y′ − 5y = 3H(t − 4)    (b) y′ + y = 7H(t − 4)    (What is y_∞?)

17  Suppose the step function turns on at T = 4 and off at T = 6. Then q(t) =
    H(t − 4) − H(t − 6). Starting from y(0) = 0, solve y′ + 2y = q(t). What is y_∞?

18  Suppose y′ = H(t − 1) + H(t − 2) + H(t − 3), starting at y(0) = 0. Find y(t).

Problems 19-25 are about delta functions and solutions to y′ − ay = q δ(t − T).

19  For all t > 0 find these integrals a(t), b(t), c(t) of point sources and graph b(t):


    (a) ∫_0^t δ(T − 2) dT    (b) ∫_0^t (δ(T − 2) − δ(T − 3)) dT    (c) ∫_0^t δ(T − 2) δ(T − 3) dT

20  Why are these answers reasonable? (They are all correct.)

    (a) ∫_{−∞}^{∞} e^t δ(t) dt = 1    (b) ∫_{−∞}^{∞} (δ(t))² dt = ∞    (c) ∫_{−∞}^{∞} e^T δ(t − T) dT = e^t


21  The solution to y′ = 2y + δ(t − 3) jumps up by 1 at t = 3. Before and after t = 3,
    the delta function is zero and y grows like e^{2t}. Draw the graph of y(t) when
    (a) y(0) = 0 and (b) y(0) = 1. Write formulas for y(t) before and after t = 3.

22  Solve these differential equations starting at y(0) = 2:
    (a) y′ − y = δ(t − 2)    (b) y′ + y = δ(t − 2). (What is y_∞?)

23  Solve dy/dt = H(t − 1) + δ(t − 1) starting from y(0) = 0: jump and ramp.

24  (My small favorite) What is the steady state y_∞ for y′ = −y + δ(t − 1) + H(t − 3)?

25  Which q and y(0) in y′ − 3y = q(t) produce the step solution y(t) = H(t − 1)?
Problems 26-31 are about exponential sources q(t) = Qe^{ct} and resonance.

26  Solve these equations y′ − ay = Qe^{ct} as in (19), starting from y(0) = 2:
    (a) y′ − y = 8e^{3t}    (b) y′ + y = 8e^{−3t}    (What is y_∞?)

27  When c = 2.01 is very close to a = 2, solve y′ − 2y = e^{ct} starting from y(0) = 1. By
    hand or by computer, draw the graph of y(t): near resonance.

28  When c = 2 is exactly equal to a = 2, solve y′ − 2y = e^{2t} starting from y(0) = 1.
    This is resonance as in equation (20). By hand or computer, draw the graph of y(t).

29  Solve y′ + 4y = 8e^{−4t} + 20 starting from y(0) = 0. What is y_∞?

30  The solution to y′ − ay = e^{ct} didn’t come from the main formula (4), but it could.
    Integrate e^{−as}e^{cs} in (4) to reach the very particular solution (e^{ct} − e^{at})/(c − a).

31  The easiest possible equation y′ = 1 has resonance! The solution y = t shows the
    factor t. What number is the growth rate a and also the exponent c in the source?

32  Suppose you know two solutions y_1 and y_2 to the equation y′ − a(t)y = q(t).
    (a) Find a null solution to y′ − a(t)y = 0.
    (b) Find all null solutions y_n. Find all particular solutions y_p.

33  Turn back to the first page of this Section 1.4. Without looking, can you write down a
    solution to y′ − ay = q(t) for all four source functions q, H(t), δ(t), e^{ct}?

34  Three of those sources in Problem 33 are actually the same, if you choose the right
    values for q and c and y(0). What are those values?

35  What differential equations y′ = ay + q(t) would be solved by y_1(t) and y_2(t)?
    Jumps, ramps, corners: maybe harder than expected (math.mit.edu/dela/Pset1.4).

    [Figure: graphs of y_1(t) and y_2(t)]
1.5 Real and Complex Sinusoids

Section 1.4 ended with the equation y′ − ay = e^{ct}. A particular solution was easy to
produce, because we kept e^{ct}. We simply chose the correct multiplier Y = 1/(c − a) in
y_p(t) = Ye^{ct}. This section changes the real number c to an imaginary number iω.
The multiplier is now Y = 1/(iω − a). The solution formula Ye^{iωt} will stay exactly
the same, but we need complex numbers (with real part and imaginary part). The
payoff is that we can solve all real problems y′ − ay = A cos ωt + B sin ωt at once.

Many scientific and engineering applications are driven by sources q(t) that oscillate
like cos ωt and sin ωt (sinusoids). Pistons go up and down to drive a car, voltages go
up and down to drive current (alternating current). The input frequency is ω, and the output
frequency is also ω. The problem is to find the amplitude and the phase in the output
(the response to the input). The real solution will be y = M cos ωt + N sin ωt.

This y(t) will be a particular solution (steady solution). It is not the transient solution
y_n(t) that decays to zero. We solve y′ − ay = q(t) when the source q(t) is a sinusoid.
For this section and the next, applications come from biology and chemistry and medicine
and more. The number a is often a rate constant. It tells the speed of a chemical reaction.

Note that RLC circuits (resistor-inductor-capacitor) produce equations with second
derivatives. Those will go into Chapter 2, but RC and RL circuits (first order equations)
belong here. Our plan for this section is straightforward: Real then complex.

1 (Real) Solve dy/dt − ay = q(t) = A cos ωt + B sin ωt.
This leads to two equations for the two coefficients M, N in y = M cos ωt + N sin ωt.

2 (Complex) Solve dy/dt − ay = q(t) = Re^{iωt}.
This leads to one easy equation for the coefficient in y = Ye^{iωt}. But that number Y is
complex, so we still have two real numbers to find (real and imaginary parts of Y).

3 (A key idea) Write the complex number 1/(iω − a) in its polar form Ge^{−iα}.
The positive number G is the gain. The angle α is the phase lag. Those have important
meanings and they are perfect to graph separately. In many problems (most problems)
G and α are more useful than the real and imaginary parts of 1/(iω − a).

So we need to explain and review complex numbers. They are worth knowing and
not difficult. The next page will solve the real problem 1 and the complex problem 2. We
can’t simplify the real problem by using cosines alone, because the term dy/dt in the
equation would unavoidably involve sin ωt.


The Review of the Key Ideas at the end organizes the important steps. 


Real Sinusoids 


We want a particular real solution y(t) when the source q(t) oscillates with frequency ω.

First order linear equation    dy/dt − ay = A cos ωt + B sin ωt.    (1)

The solution will have the same form y = M cos ωt + N sin ωt as the source term. By
matching the cos ωt terms and separately the sin ωt terms, you get two equations for M and
N. Just subtract ay = aM cos ωt + aN sin ωt from dy/dt = −ωM sin ωt + ωN cos ωt.

dy/dt − ay = q    cos ωt terms:  −aM + ωN = A
                  sin ωt terms:  −ωM − aN = B    (2)

Those two equations tell us M and N in the real solution y(t) = M cos ωt + N sin ωt.
I will write down the solution to equation (2), and then describe two ways to find it.

Source q = A cos ωt + B sin ωt
Solution y = M cos ωt + N sin ωt    M = −(aA + ωB)/(ω² + a²)    N = (ωA − aB)/(ω² + a²)    (3)

I would find N by eliminating M in equation (2). If you multiply the first equation by ω
and the second equation by a, then subtraction removes M. The right side is ωA − aB,
the left side is (ω² + a²)N. Then N is correct in equation (3). Similarly we find M.

For two equations it is also practical to find M and N from the 2 by 2 inverse matrix:

[ −a   ω ] [ M ]   [ A ]            [ M ]        1      [ −a  −ω ] [ A ]
[ −ω  −a ] [ N ] = [ B ]    gives   [ N ] = --------- [  ω  −a ] [ B ]
                                             ω² + a²

The matrix on the left times its inverse on the right gives the identity matrix I in Chapter 4.
That denominator ω² + a² of the inverse matrix appears in M and N, in the solution (3).
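
The formulas for M and N can be checked against the two matching equations. This is a minimal sketch; the function names are hypothetical, and `omega` stands for the frequency in the text.

```python
def sinusoid_coefficients(a, omega, A, B):
    # Formula (3): M and N share the denominator omega^2 + a^2.
    d = omega ** 2 + a ** 2
    M = -(a * A + omega * B) / d
    N = (omega * A - a * B) / d
    return M, N

def matching_residuals(a, omega, A, B):
    # Substitute M, N into the two equations (2); both residuals should be zero.
    M, N = sinusoid_coefficients(a, omega, A, B)
    return (-a * M + omega * N - A, -omega * M - a * N - B)
```

For a = 1, omega = 1, A = 1, B = −1 this gives M = 0 and N = 1, so y = sin t solves y′ − y = cos t − sin t.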


Complex Sinusoid e^{iωt}

Now we come to the very important input q(t) = Re^{iωt}. That input is oscillating with
frequency ω radians per second. The output y(t) will oscillate with the same frequency ω.
This is true because a is constant in the differential equation. When y(t) = Ye^{iωt} includes
the same factor e^{iωt}, that factor cancels from every term in the equation:

q(t) = Re^{iωt}
y(t) = Ye^{iωt}    y′ − ay = q  becomes  iωYe^{iωt} − aYe^{iωt} = Re^{iωt}.    (4)

When we divide by e^{iωt}, this leaves an easy algebra problem for the complex number Y:

Response Y(ω)    iωY − aY = R  gives  Y = R/(iω − a)  and  y = Ye^{iωt}.    (5)

The simplicity of the solution y = Ye^{iωt} comes from one key fact: The derivative of e^{iωt} is a
multiple of e^{iωt} (the multiplying factor is iω). This was not true for cos ωt.
Its derivative brings in sin ωt. So we had to solve two real equations for M and N,
while (5) is one complex equation for Y.
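
A sketch of the complex response using Python's built-in complex numbers. The function names are hypothetical. The check compares the real part of Ye^{iωt} (with R = 1) to the real formula for the input cos ωt, that is A = 1 and B = 0.

```python
import cmath
import math

def complex_response(a, omega, R=1.0):
    # One complex equation: (i*omega - a) Y = R, so Y = R / (i*omega - a).
    return R / (1j * omega - a)

def y_complex(a, omega, t, R=1.0):
    # Complex solution y = Y e^{i omega t}.
    return complex_response(a, omega, R) * cmath.exp(1j * omega * t)

def y_real_for_cosine(a, omega, t):
    # Real solution M cos(omega t) + N sin(omega t) for the input cos(omega t):
    # M = -a/(omega^2 + a^2), N = omega/(omega^2 + a^2).
    d = omega ** 2 + a ** 2
    return (-a * math.cos(omega * t) + omega * math.sin(omega * t)) / d
```

The two routes agree: `y_complex(a, omega, t).real` matches `y_real_for_cosine(a, omega, t)` up to rounding.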
Complex Numbers: Rectangular and Polar

The complex number z = x + iy has real part x and imaginary part y. The basic ideas
are explained here; more details are in Section 2.2. We plot all z in the complex plane
(the real-imaginary plane). Figure 1.5 shows the particular number z = 4 + 3i with
x = Re z = 4 and y = Im z = 3. No problem with the rectangular form 4 + 3i,
except that multiplying and dividing are not at all convenient in x − y coordinates.

The first figure also shows the polar form of the same number z. The magnitude (or
modulus) is r. The phase is the angle θ. From x and y we can find r and θ.

The magnitude is r = √(x² + y²) = √25 = 5. The angle θ has tangent y/x = 3/4.

Figure 1.5: (a) z = 4 + 3i is a point in the complex plane. Its polar form is z = 5e^{iθ}.
[Figure labels: Real and Imaginary axes, x = 4 = r cos θ, the complex conjugate z̄, and a second number z = 2 + i = √5 e^{iθ} with a different θ from part (a).]

The polar form is perfect for multiplication and division of complex numbers. To
multiply re^{iθ} times Re^{iα}, add the angles and multiply r times R. To divide, subtract
the angles and divide r by R.

Multiply  (re^{iθ})(Re^{iα}) = rR e^{i(θ+α)}    Divide  re^{iθ}/Re^{iα} = (r/R) e^{i(θ−α)}    (6)

The polar form is also perfect for squaring a complex number re^{iθ} and for 1/re^{iθ}:

Square  z² = (re^{iθ})(re^{iθ}) = r² e^{2iθ}    Invert  1/z = 1/(re^{iθ}) = (1/r) e^{−iθ}    (7)

Let me compare that polar form of 1/z with 1/(x + iy). Multiply by (x − iy)/(x − iy) = 1.

1/z = 1/(x + iy) = (x − iy)/((x + iy)(x − iy)) = (x − iy)/(x² + y²)    1/(4 + 3i) = (4 − 3i)/((4 + 3i)(4 − 3i)) = (4 − 3i)/25

This number x − iy appears often. It is the complex conjugate z̄ of the number z = x + iy.
Notice that x + iy times x − iy is x² + y². In other words z times z̄ is |z|² = r².
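
The two ways to invert z = 4 + 3i can be compared with Python's `cmath` module. This is only an illustrative sketch; the variable names are assumptions.

```python
import cmath

z = 4 + 3j
r, theta = cmath.polar(z)                          # magnitude r and angle theta, so z = r e^{i theta}
inverse_polar = (1 / r) * cmath.exp(-1j * theta)   # polar route: (1/r) e^{-i theta}
inverse_rect = z.conjugate() / abs(z) ** 2         # rectangular route: (x - iy)/(x^2 + y^2)
```

Both routes give the same number, (4 − 3i)/25, and r comes out as 5.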
Figure 1.6: Points e^{iθ} on the unit circle have r = 1. When e^{iθ} multiplies e^{iα}, angles add.
[Figure labels: e^{iθ} = cos θ + i sin θ, e^{i(θ+α)}, and e^{−iθ} = 1/e^{iθ} = cos θ − i sin θ.]

The Unit Circle

Figure 1.6 shows the unit circle, where every radial distance is r = 1. Then we just add
the angles to multiply, or double the angles to square, or subtract the angles to divide:

On the circle    (e^{iθ})(e^{iα}) = e^{i(θ+α)}    (e^{iθ})² = e^{2iθ}    1/e^{iθ} = e^{−iθ}

e^{−iθ} is the complex conjugate of e^{iθ}, the mirror image across the real axis in Figure 1.6.

Example 1 Describe the paths of the numbers e^{st} and e^{iωt} and e^{(s+iω)t} in the complex
plane (real s and real ω). The time t goes from 0 to ∞. Those paths start at 1.

Solution If s > 0, the number e^{st} goes from 1 out the real axis to infinity. If s < 0,
then e^{st} goes from 1 in to zero. All real.

The path of e^{iωt} goes around the unit circle with constant speed. At time T = 2π/ω
(and also 2T, 3T, ...) it comes back to e^{2πi} = 1. The path goes clockwise if ω < 0.

The path of e^{(s+iω)t} spirals outward to infinity if s > 0. It spirals inward to zero
if s < 0. At time T = 2π/ω it is a real number e^{sT}, because the factor e^{iωT} = e^{2πi} is 1.


The Gain G and the Phase Lag α

The complex number 1/(iω − a) multiplies the input q(t) = Re^{iωt} to give the output
y(t) = Ye^{iωt}. What is the magnitude of 1/(iω − a) and what is its angle? We need
its polar form 1/(iω − a) = Ge^{−iα}. Start with iω − a = re^{iα} and then invert:

iω − a = re^{iα}    r = √(ω² + a²)  and  tan α = imaginary part / real part = ω/(−a)

We want 1/(re^{iα}). This will be Ge^{−iα}. The gain is G = 1/r = 1/√(ω² + a²):

Gain G, Phase angle α    1/(iω − a) = 1/(re^{iα}) = (1/√(ω² + a²)) e^{−iα} = Ge^{−iα}.    (8)

Figure 1.7: Dimensionless gain G and phase angle φ as functions of frequency ω.

The gain G(ω) and the angle α(ω) are often graphed. The graphs below are variations of
“Bode plots.” The amplitude response G(ω) is especially important, and
you are very likely to see that gain G by itself, often including an extra factor |a|.

Note One common variation is to include the rate constant a in the forcing term
q(t) = aRe^{iωt}. We still think of Re^{iωt} as the input, then a gives q the right physical units.
That factor a will appear in the output. So the gain G = |output| / |input| will be increased by
that factor |a|. Then G = |a|/√(ω² + a²) is 1 at the frequency ω = 0.
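
A minimal sketch of the gain and phase lag, assuming Python's `cmath.polar` is used to read off r and α from the actual number iω − a (so the quadrant is decided by the number, not only by the tangent). The function name and the sample values a = 1, ω = 1 are illustrative assumptions.

```python
import cmath
import math

def gain_and_phase_lag(a, omega):
    # Polar form i*omega - a = r e^{i alpha}; then 1/(i*omega - a) = G e^{-i alpha} with G = 1/r.
    r, alpha = cmath.polar(1j * omega - a)
    return 1.0 / r, alpha

G, alpha = gain_and_phase_lag(1.0, 1.0)   # sample values: i - 1 lies in the second quadrant
```

For these sample values G = 1/√2 and α = 3π/4.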


Sinusoids R cos(ωt − φ)

The next page will show that any combination of cos ωt and sin ωt is a shifted cosine. It has
frequency ω and amplitude R and phase lag φ. If you know ω and R and φ, it is no problem
to graph y(t) = R cos(ωt − φ). To go the other way, and read off those three numbers
from the graph, is much more interesting.

This mystery sinusoid came from lecture notes for MIT’s course 18.03. The website
mathlets.org has interactive experiments. The question here is: Find ω, R, and φ.

[Figure: graph of the mystery sinusoid]
The Sinusoidal Identity

We want to choose the magnitude R and the angle φ so that A cos ωt + B sin ωt
is the real part of Re^{i(ωt−φ)}. We can and will solve y′ − ay = Re^{i(ωt−φ)} quickly.
When we take the real part of all terms in this differential equation, the correct input
q(t) = R cos(ωt − φ) will appear on the right side and the correct output y(t) will
appear on the left side. The real equation will be solved in one step.

So we want this identity for the “sinusoidal” input q(t):

Sinusoidal identity    A cos ωt + B sin ωt = R cos(ωt − φ)    (9)

The right side has the same period 2π/ω as the left side, and only one term.
To find R and φ, expand R cos(ωt − φ) into R cos ωt cos φ + R sin ωt sin φ. Then
match cosines to find A and match sines to find B:

A = R cos φ  and  B = R sin φ    A² + B² = R²  and  tan φ = B/A.    (10)

So we know R = √(A² + B²) and φ = tan⁻¹(B/A) in the sinusoidal identity. The beauty of
R and φ is that they match sinusoids to the polar form of complex numbers.

A + iB = Re^{iφ}        polar form of A + iB
R = √(A² + B²)          produces R and φ in the
tan φ = B/A             sinusoidal identity (9)

For practice with this important formula, Problem 1 will develop a slightly different proof.

Example 2 Write q(t) = cos 3t + sin 3t as R cos(3t − φ): the real part of Re^{i(3t−φ)}.
Solution A = 1 and B = 1 so that R = √2. The angle φ = π/4 has tan φ = B/A = 1.
Then cos 3t + sin 3t = √2 cos(3t − π/4).

Example 3 Write the real part of e^{5it}/(√3 + i) in the form A cos 5t + B sin 5t.
Solution √3 + i is 2e^{iπ/6} (why?) Then e^{5it}/(√3 + i) is (1/2)e^{i(5t−π/6)}. Its real part is

(1/2) cos(5t − π/6) = (1/2)(cos 5t cos π/6 + sin 5t sin π/6) = (√3/4) cos 5t + (1/4) sin 5t.
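
Examples 2 and 3 can be reproduced with a two-line helper. This sketch uses `math.atan2(B, A)` rather than tan⁻¹(B/A); that is a choice made here (not stated in the text) so that the angle lands in the correct quadrant when A is negative or zero. The function name is hypothetical.

```python
import math

def amplitude_phase(A, B):
    # R and phi with A = R cos(phi) and B = R sin(phi), so A cos(wt) + B sin(wt) = R cos(wt - phi).
    return math.hypot(A, B), math.atan2(B, A)

R2, phi2 = amplitude_phase(1.0, 1.0)                  # Example 2: cos 3t + sin 3t
R3, phi3 = amplitude_phase(math.sqrt(3) / 4, 0.25)    # Example 3 read backwards from its answer
```

Example 2 gives R = √2 and φ = π/4; the coefficients √3/4 and 1/4 of Example 3 give back R = 1/2 and φ = π/6.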


Real Solution y from Complex Solution y_c

The sinusoidal identity solves y′ − ay = A cos ωt + B sin ωt in three steps:

1. This equation is the real part of the complex equation y_c′ − ay_c = Re^{i(ωt−φ)}.

2. The complex solution is y_c = Re^{i(ωt−φ)}/(iω − a) = RG e^{i(ωt−φ−α)}.

3. The real part of that complex solution y_c is the desired real solution y(t).

Those three steps are 1 (real to complex) 2 (solve complex) 3 (complex to real).
This will succeed. The second step expresses 1/(iω − a) as Ge^{−iα} to keep the polar form.
The third step produces y = M cos ωt + N sin ωt directly as y = RG cos(ωt − φ − α).


Example 4 Take those three steps real-complex-real to solve y′ − y = cos t − sin t.

We have to find R, φ, G, and α from the numbers a = 1, ω = 1, A = 1, and B = −1.
Notice that RG = 1.

R = √(A² + B²) = √2    tan φ = B/A = −1  and  φ = −π/4    G = 1/√(ω² + a²) = 1/√2

The angle for iω − a = i − 1 is α = 3π/4. Its tangent is −ω/a = −1.

1. The sinusoidal identity is cos t − sin t = √2 cos(t − φ) = √2 cos(t + π/4).

2. y_complex = √2 e^{i(t+π/4)}/(i − 1). Here 1/(i − 1) = Ge^{−iα} = (1/√2) e^{−3πi/4}.

3. y_complex = RG e^{i(t−φ−α)} = e^{i(t−π/2)}. Then y_real = cos(t − π/2) = sin t.

That example was chosen so that G = 1/√2 cancelled R = √2. If we keep all the
symbols R, φ, G, α then the solution y_real = RG cos(ωt − φ − α) from Step 3 must
agree with the solution y = M cos ωt + N sin ωt at the start of this section.
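
The three steps real-complex-real fit in a few lines. This is a sketch under stated assumptions: the function names are hypothetical, and `atan2` and `cmath.polar` are used so that φ and α come from the actual numbers A + iB and iω − a.

```python
import cmath
import math

def real_solution(a, omega, A, B, t):
    # Step 1: A cos(wt) + B sin(wt) is the real part of R e^{i(wt - phi)}.
    R, phi = math.hypot(A, B), math.atan2(B, A)
    # Step 2: 1/(i*omega - a) = G e^{-i alpha}, from the polar form of i*omega - a.
    r, alpha = cmath.polar(1j * omega - a)
    G = 1.0 / r
    # Step 3: real part of R G e^{i(wt - phi - alpha)}.
    return R * G * math.cos(omega * t - phi - alpha)

def real_solution_MN(a, omega, A, B, t):
    # Same solution from the real formulas for M and N, for comparison.
    d = omega ** 2 + a ** 2
    M, N = -(a * A + omega * B) / d, (omega * A - a * B) / d
    return M * math.cos(omega * t) + N * math.sin(omega * t)
```

For Example 4 (a = 1, ω = 1, A = 1, B = −1) the first function returns sin t, and in general the two functions agree.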

The key point in many applications is not necessarily the numbers in the formula for y(t).
Very often the goal is to see from the formula how y(t) depends on parameters like a and ω in
the differential equation. The gain G = |output|/|input| is a convenient
and very important guide.

The truth is that the complex solution is better. The sinusoidal identity shows how
every combination A cos ωt + B sin ωt is the real part R cos(ωt − φ) of a complex
exponential Re^{i(ωt−φ)}. So we can convert real to complex and complex back to real.

In between, solve the complex form by using the frequency response 1/(iω − a).

Conclusion When the input q(t) is Re^{iωt}, the output y(t) multiplies by 1/(iω − a).
This multiplying factor is a complex number, and it changes with the frequency ω.
We absolutely need to understand that number Y and graph its magnitude G and its phase.


REVIEW OF THE KEY IDEAS

1. (Real) y′ − ay = A cos ωt + B sin ωt leads to y_real = M cos ωt + N sin ωt.

2. (Sinusoidal identity) A cos ωt + B sin ωt equals R cos(ωt − φ) with R² = A² + B².

3. (Complex) y′ − ay = Re^{i(ωt−φ)} leads to y_complex = Re^{i(ωt−φ)}/(iω − a).

4. (Complex gain) 1/(iω − a) = Ge^{−iα} with G = 1/√(ω² + a²) and tan α = −ω/a.

5. (Real part of the complex solution) y_real = Re(y_complex) = RG cos(ωt − α − φ).
Problem Set 1.5

Problems 1-6 are about the sinusoidal identity (9). It is stated again in Problem 1.

1  These steps lead again to the sinusoidal identity. This approach doesn’t start with
   the usual formula cos(ωt − φ) = cos ωt cos φ + sin ωt sin φ from trigonometry.
   The identity says:

   If A + iB = Re^{iφ} then A cos ωt + B sin ωt = R cos(ωt − φ).

   Here are the four steps to find that real part of Re^{i(ωt−φ)}. Explain A − iB in Step 3.

   R cos(ωt − φ) = Re[Re^{i(ωt−φ)}] = Re[e^{iωt}(Re^{−iφ})] = (what is Re^{−iφ}?)
                 = Re[(cos ωt + i sin ωt)(A − iB)] = A cos ωt + B sin ωt.

2  To express sin 5t + cos 5t as R cos(ωt − φ), what are R and φ?

3  To express 6 cos 2t + 8 sin 2t as R cos(2t − φ), what are R and tan φ and φ?

4  Integrate cos ωt to find (sin ωt)/ω in this complex way.
   (i) dy_real/dt = cos ωt is the real part of dy_complex/dt = e^{iωt}.
   (ii) Take the real part of the complex solution.

5  The sinusoidal identity for A = 0 and B = −1 says that −sin ωt = R cos(ωt − φ).
   Find R and φ.

6  Why is the sinusoidal identity useless for the source q(t) = cos t + sin 2t?

7  Write 2 + 3i as re^{iφ}, so that 1/(2 + 3i) = (1/r)e^{−iφ}. Then write y = e^{iωt}/(2 + 3i) in polar form.
   Then find the real and imaginary parts of y. And also find those real and imaginary
   parts directly from (2 − 3i)e^{iωt}/(2 − 3i)(2 + 3i).

8  Write these functions A cos ωt + B sin ωt in the form R cos(ωt − φ): Right triangle
   with sides A, B, R and angle φ.

   1) cos 3t − sin 3t    2) √3 cos πt − sin πt    3) 3 cos(t − φ) + 4 sin(t − φ)


Problems 9-15 solve real equations using the real formula (3) for M and N.

9  Solve dy/dt = 2y + 3 cos t + 4 sin t after recognizing a and ω. Null solutions Ce^{2t}.

10  Find a particular solution to dy/dt = −y − cos 2t.

11  What equation y′ − ay = A cos ωt + B sin ωt is solved by y = 3 cos 2t + 4 sin 2t?

12  The particular solution to y′ = y + cos t in Section 1.4 is y_p = e^t ∫ e^{−s} cos s ds.
    Look this up or integrate by parts, from s = 0 to t. Compare this y_p to formula (3).

13  Find a solution y = M cos ωt + N sin ωt to y′ − 4y = cos 3t + sin 3t.

14  Find the solution to y′ − ay = A cos ωt + B sin ωt starting from y(0) = 0.

15  If a = 0 show that M and N in equation (3) still solve y′ = A cos ωt + B sin ωt.

Problems 16-20 solve the complex equation y′ − ay = Re^{i(ωt−φ)}.


16  Write down complex solutions y_p = Ye^{iωt} to these three equations:
    (a) y′ − 3y = 5e^{2it}    (b) y′ = Re^{i(ωt−φ)}    (c) y′ = 2y − e^{it}

17  Find complex solutions z_p = Ze^{iωt} to these complex equations:
    (a) z′ + 4z = e^{8it}    (b) z′ + 4iz = e^{8it}    (c) z′ + 4iz = e^{8t}


18  Start with the real equation y′ − ay = R cos(ωt − φ). Change to the complex equation
    z′ − az = Re^{i(ωt−φ)}. Solve for z(t). Then take its real part y_p = Re z.

19  What is the initial value y_p(0) of the particular solution y_p from Problem 18?
    If the desired initial value is y(0), how much of the null solution y_n = Ce^{at}
    would you add to y_p?

20  Find the real solution to y′ − 2y = cos ωt starting from y(0) = 0, in three steps: Solve
    the complex equation z′ − 2z = e^{iωt}, take y_p = Re z, and add the null
    solution y_n = Ce^{2t} with the right C.

Problems 21-27 solve real equations by making them complex. First a note on α.

Example 4 was y′ − y = cos t − sin t, with growth rate a = 1 and frequency ω = 1.
The magnitude of iω − a is √2 and the polar angle has tan α = −ω/a = −1. Notice:
Both α = 3π/4 and α = −π/4 have that tangent! How to choose the correct angle α?

The complex number iω − a = i − 1 is in the second quadrant. Its angle is α = 3π/4.
We had to look at the actual number and not just the tangent of its angle.

21  Find r and α to write each iω − a as re^{iα}. Then write 1/re^{iα} as Ge^{−iα}.

    (a) √3 i + 1    (b) √3 i − 1    (c) i − √3

22  Use G and α from Problem 21 to solve (a)-(b)-(c). Then take the real part of each
    equation and the real part of each solution.

    (a) y′ + y = e^{i√3t}    (b) y′ − y = e^{i√3t}    (c) y′ − √3y = e^{it}
23  Solve y′ − y = cos ωt + sin ωt in three steps: real to complex, solve complex, take
    real part. This is an important example.

    (1) Find R and φ in the sinusoidal identity to write cos ωt + sin ωt as the real part
        of Re^{i(ωt−φ)}.

    (2) Solve y′ − y = e^{iωt} by y = Ge^{−iα}e^{iωt}. Multiply by Re^{−iφ} to solve
        z′ − z = Re^{i(ωt−φ)}.

    (3) Take the real part y(t) = Re z(t). Check that y′ − y = cos ωt + sin ωt.

24  Solve y′ − √3y = cos t + sin t by the same three steps with a = √3 and ω = 1.

25  (Challenge) Solve y′ − ay = A cos ωt + B sin ωt in two ways. First, find
    R and φ on the right and G and α on the left. Show that the final real solution
    RG cos(ωt − φ − α) agrees with M cos ωt + N sin ωt in equation (2).

26  We don’t have resonance for y′ − ay = Re^{iωt} when a and ω ≠ 0 are real. Why not?
    (Resonance appears when y_n = Ce^{at} and y_p = Ye^{ct} share the exponent a = c.)

27  If you took the imaginary part y = Im z of the complex solution to z′ − az = Re^{i(ωt−φ)},
    what equation would y(t) solve? Answer first with φ = 0.

Problems 28-31 solve first order circuit equations: not RLC but RL and RC.

[Figure: RL loop with source V cos ωt, inductor L, resistor R and current I(t); RC loop with source V cos ωt, resistor R, capacitor C and charge q(t) = integral of I(t)]

28  Solve L dI/dt + RI(t) = V cos ωt for the current I(t) = I_n + I_p in the RL loop.

29  With L = 0 and ω = 0, that equation is Ohm’s Law V = IR for direct current.
    The complex impedance Z = R + iωL replaces R when L ≠ 0 and I(t) = Ie^{iωt}.

    L dI/dt + RI(t) = (iωL + R)Ie^{iωt} = Ve^{iωt} gives ZI = V.

    What is the magnitude |Z| = |R + iωL|? What is the phase angle in Z = |Z|e^{iθ}?
    Is the current |I| larger or smaller because of L?

30  Solve R dq/dt + (1/C) q(t) = V cos ωt for the charge q(t) = q_n + q_p in the RC loop.

31  Why is the complex impedance now Z = R + 1/(iωC)? Find its magnitude |Z|.
    Note that mathematics prefers i = √−1, we are not conceding yet to j = √−1!
1.6 Models of Growth and Decay


This is an important section. It combines formulas with their applications. The formulas 
solve the key linear equation y’ — a(t)y = q(t)—we are very close to the solution. 
Now a can vary with t. The final step is to see the purpose of those formulas. 

The point of this subject and this course is to understand change. Calculus is about 
change. A differential equation is a model of change. It connects dy/dt to the current value 
of y and to inputs/outputs that produce change. We see this as a math equation and solve it 
by a formula. If we stop there, we miss the whole reason for differential equations. 


I will select five models of growth or decay, and five equations to describe them. 
Often the hardest part is to get the right equation. (Definitely harder than the right solution 
formula.) This section presents both steps of applied mathematics : 


1. From the model to the equation 2. From the equation to the solution. 


Our plan is to take the second step (the easier step) first: Solve the equation. Find the 
output y(t) from inputs a(t) and q(t) and y(0). Then come the models. 

Here is the differential equation for y(t). We want a formula to solve it—and we want to 
understand where that formula comes from. The solution y(t) must use the three inputs a(t) 
and q(t) and y(0), because they define the problem. Sometimes a(t) changes with time. 
This possibility was not allowed in Sections 1.4 and 1.5. 


dy 


Differential equation — =a(t)y+ q(t) | starting from y(0) att=0. (1) 


dt 


Up to now, our models had limited options for those inputs (and a was constant) : 
Growth rate a(t) The classic exponential y(t) = e’ hada = 1 
Source term q(t) Sections 1.4 and 1.5 had five particular inputs like ect and ett 
Initial value y(0) The starting value for y(t) = e’ was y(0) = 1 


The “initial value” y(0) is like a deposit to open a bank account. The source or sink q(t) 
comes from saving or spending as time goes on. The solution y(t) is the balance in the 
account at time t. I will reveal the final formula now, so you know where we are going. 


Growth factor G(s, t) from time s to time t   $y(t) = G(0,t)\,y(0) + \int_0^t G(s,t)\,q(s)\,ds.$   (2)


Formula (2) has two parts. The first part $y_n = G(0,t)\,y(0)$ has q = 0: no source.
The second part $y_p$ introduces the source q(t), which adds fresh growth G times q
(or subtracts when q(t) is negative). Go forward 2 pages to see the factor G(s, t).


y = (Null solution with q = 0) + (Particular solution from the input q).
Particular Solution from q(t)


On this page a is constant. The particular solution $y_p(t)$ is so important that
we will reach it in three ways. Of course those three approaches will be closely related,
but they are different enough and valuable enough to be presented separately :


1. Integrating factor 2. Variation of parameters 3. Combine all outputs. 


1. The integrating factor $M(t) = e^{-at}$ was seen in Section 1.4. It solves M' = -aM.
For constant growth rate a, multiplying the equation y' - ay = q(t) by $M = e^{-at}$
turns the left side into an exact derivative of My:

$\dfrac{d}{dt}\left(e^{-at}y\right) = e^{-at}(y' - ay) = e^{-at}q(t).$   (3)

Then we integrate the left and right hand sides to find $y = y_p(t)$ with $y_p(0) = 0$:

$e^{-at}y_p(t) = \int_0^t e^{-as}q(s)\,ds$   and   $y_p(t) = \int_0^t e^{a(t-s)}q(s)\,ds.$   (4)


2. Variation of parameters starts with the solutions $y_n = Ce^{at}$ to the null equation
y' - ay = 0. The new idea is to let C vary with time in the particular solution.
Substitute $y = C(t)e^{at}$ into the equation y' - ay = q(t) to find $C'e^{at} = q(t)$:

$(Ce^{at})' - aCe^{at} = C'e^{at} + aCe^{at} - aCe^{at} = C'e^{at} = q(t).$   (5)

Then $C' = e^{-at}q(t)$. Integrate to find C and the solution formula we want:

$C(t) = \int_0^t e^{-as}q(s)\,ds$   $y_p(t) = C(t)\,e^{at} = \int_0^t e^{a(t-s)}q(s)\,ds.$   (6)


The integrating factor M changes the equation. Varying C(t) changes the solution.
C(t) will stay important for systems of n equations ; integrating factors lose out.


3. Each input q(s) grows to $e^{a(t-s)}q(s)$ in the time between s and t. Then the
solution $y_p(t)$ comes from these inputs q(s) and growth factor $G = e^{a(t-s)}$.
Add up (integrate) all those outputs :

Growing time for q(s) is t - s   Output   $y_p(t) = \int_0^t e^{a(t-s)}q(s)\,ds.$   (7)

To me, this third approach captures the meaning of the formulas (4) = (6) = (7). I like to
think of each input q(s) growing by the factor $G(s,t) = e^{a(t-s)}$ in the time t - s.
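
The "integral of outputs" can be tried numerically. Here is a minimal sketch, assuming constant a and a simple trapezoidal rule; the function name `particular_solution` and the test values a = 0.5, t = 2 are illustrative choices, not from the text. For the constant source q = 1 the integral has the closed form $(e^{at} - 1)/a$, so the two printed numbers should nearly agree.

```python
import math

def particular_solution(a, q, t, n=2000):
    """Approximate y_p(t) = integral from 0 to t of e^{a(t-s)} q(s) ds
    with the composite trapezoidal rule on n subintervals."""
    h = t / n
    total = 0.0
    for i in range(n + 1):
        s = i * h
        weight = 0.5 if i in (0, n) else 1.0   # end points count half
        total += weight * math.exp(a * (t - s)) * q(s)
    return h * total

# Constant source q = 1: compare with the closed form (e^{at} - 1)/a.
a, t = 0.5, 2.0
numeric = particular_solution(a, lambda s: 1.0, t)
exact = (math.exp(a * t) - 1.0) / a
print(numeric, exact)
```

Any other source q(s) can be passed in the same way; only the closed form used for comparison is special to q = 1.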
Changing Growth Rate a(t)


The next step is to let a(t) change in time. For example a(t) could be 1 + cos t, varying
between 2 and 0. Certainly interest rates do change. The growth rate a of your bank
balance often slows down or speeds up. Then the growth factor G(0, t) is not just $e^{at}$.


The null solution to $y_n' = a(t)\,y_n$ shows this clearly. It is the growth from time 0 to time t:

Integrate a from 0 to t, then take the exponential   $G(0,t) = e^{\int_0^t a(T)\,dT}$   $y_n(t) = G(0,t)\,y(0).$   (8)


The key point is that dG/dt = a(t) G. First, the derivative of the integral of a(t) 
is a(t)—by the Fundamental Theorem of Calculus. Second, the chain rule produces the 
derivative of G, when that integral goes into the exponent. Here is dG/dt: 


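$\dfrac{d}{dt}\left(e^{\text{integral of } a}\right) = \left(e^{\text{integral of } a}\right)\dfrac{d}{dt}(\text{integral of } a)$   which is   $\dfrac{dG}{dt} = (G)(a(t))$   (9)

When a is constant, that integral is just at. This leads to the usual growth $G = e^{at}$.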


When a varies, the exponent is messier than at but the idea is the same: dG/dt = aG. 
Our example is a(t) = 1 + cost. The integral of a(t) is t + sint. This is the exponent: 


Growth factor   $G(0,t) = e^{t + \sin t}$   Null solution   $y_n(t) = e^{t + \sin t}\,y(0)$


Now we tackle the particular solution that comes from the inputs q(t) when they grow.
Again this $y_p(t)$ can come from an integrating factor or variation of parameters or
an integral of all outputs from all inputs.


1. The integrating factor is $M(t) = 1/G(0,t) = e^{-\int_0^t a(s)\,ds}$. This has M' = -a(t)M.

Then the derivative of My is exactly Mq, when we use M' = -aM.

Product rule, chain rule   $\dfrac{d}{dt}(My) = My' + M'y = M(y' - a(t)y) = Mq(t).$   (10)


Integrate both sides of (My)' = Mq starting from $y_p(0) = 0$. Then divide by M :

$M(t)\,y_p(t) = \int_0^t M(s)\,q(s)\,ds$   $y_p(t) = e^{\int_0^t a(T)\,dT}\int_0^t e^{-\int_0^s a(T)\,dT}\,q(s)\,ds$   (11)


When you multiply those exponentials, the exponents combine. The integral from 0 to t,
minus the integral from 0 to s, equals the integral from s to t. Each q(s) enters at s.
The exponential of the integral of a from s to t is the growth factor G(s, t) :

Growth factor   $G(s,t) = e^{\int_s^t a(T)\,dT}$   Solution   $y_p(t) = \int_0^t G(s,t)\,q(s)\,ds$   (12)
2. Variation of parameters. I will save this method to use in Chapter 2 for second
order equations (with y''). Then all three methods get an equal chance: variation of
parameters can solve equations that go beyond y' = a(t)y + q(t).


3. Integral of outputs (my own choice). The input q(s) enters at time s. It grows
or decays until time t. The growth factor multiplying q over that time is G(s,t).
Since a(t) changes, the growth factor needs the integral of a. The inputs are q(s),
the outputs are G(s, t) q(s), and the total output $y_p(t)$ agrees with (12):

$y_p(t) = \int_0^t G(s,t)\,q(s)\,ds$   (13)


When q is a delta function at time s (an impulse), the response is $y_p = G(s, t)$ at time t.

Example 1   The growth rate a(t) = 2t puts the economy into serious inflation. The integral
of a(t) from s to t is $\int_s^t 2T\,dT = t^2 - s^2$. Then G is the growth from s to t:

$G(s,t) = e^{t^2 - s^2}$   $y' = 2ty + q(t)$ has $y_p(t) = \int_0^t e^{t^2 - s^2}\,q(s)\,ds.$
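
When a(t) has no convenient antiderivative, the growth factor can still be computed by numerical integration. The sketch below is one way to do it, assuming Simpson's rule; the name `growth_factor` and the sample times are illustrative, not from the text. It is checked against the two closed forms above: a(t) = 1 + cos t gives $G(0,t) = e^{t + \sin t}$, and a(t) = 2t gives $G(s,t) = e^{t^2 - s^2}$.

```python
import math

def growth_factor(a, s, t, n=1000):
    """G(s, t) = exp(integral of a(T) dT from s to t), using Simpson's rule
    with n subintervals (n must be even)."""
    h = (t - s) / n
    total = a(s) + a(t)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * a(s + i * h)
    return math.exp(h * total / 3.0)

# a(t) = 1 + cos t: compare with e^{t + sin t}
t = 3.0
g_cos = growth_factor(lambda T: 1.0 + math.cos(T), 0.0, t)
print(g_cos, math.exp(t + math.sin(t)))

# a(t) = 2t from s = 1 to t = 2: compare with e^{t^2 - s^2} = e^3
g_2t = growth_factor(lambda T: 2.0 * T, 1.0, 2.0)
print(g_2t, math.exp(3.0))
```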


Example 2   Here is an interesting case for investors. Suppose the interest rate a goes to
zero. What happens to the solution formula? The first term $y_n$ becomes y(0). This deposit
doesn't grow or disappear, it stays fixed. The growth factor is G = 1 and we just add up all
the inputs (they didn't grow) :

a = 0   $y' = q(t)$ has the particular solution $y_p(t) = \int_0^t q(s)\,ds.$


The problem comes when we start with the formula to solve y' = ay + q (constant q):

$y(t) = e^{at}y(0) + \int_0^t e^{a(t-s)}\,q\,ds = e^{at}y(0) + q\,\dfrac{e^{at} - 1}{a}.$

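That looks bad at a = 0 because of dividing by a. But the factor $e^{at} - 1$ is also zero.
This is a case for l'Hôpital's Rule. Wonderful! We can make sense of 0/0:

$\lim_{a \to 0} \dfrac{e^{at} - 1}{a} = \lim_{a \to 0} \dfrac{\text{derivative of } e^{at} - 1 \text{ with respect to } a}{\text{derivative of } a \text{ with respect to } a} = \lim_{a \to 0} \dfrac{t\,e^{at}}{1} = t.$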


The particular solution from y' = q reduces to q times t. That is the total savings during
the time from 0 to t. With a = 0 it doesn't grow. Like putting money under a mattress,
a = 0 means no risk and no gain. Then dy/dt = q has y(t) = y(0) + qt.
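
The 0/0 limit also matters in computation: for tiny a, subtracting 1 from $e^{at}$ loses digits. A small sketch, assuming Python's standard `math.expm1` (which computes $e^x - 1$ accurately for small x); the function name `savings_factor` is an illustrative choice. The printed values should approach t = 5 as a shrinks.

```python
import math

def savings_factor(a, t):
    """(e^{at} - 1)/a, the factor multiplying a constant source q.
    At a = 0 the limit from l'Hopital's Rule is t."""
    if a == 0.0:
        return t
    return math.expm1(a * t) / a   # expm1(x) = e^x - 1, accurate for small x

t = 5.0
for a in (0.1, 0.01, 0.0001, 0.0):
    print(a, savings_factor(a, t))
```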


Now the solution formula can be applied to real problems. 
Models of Growth and Decay


The whole point of a differential equation is to give a mathematical model of a practical 
problem. It is my duty to show you examples. This section will offer growth equations 
(a > 0), decay equations (a < 0), and the balance equation that controls the temperature of 
the Earth. That balance equation is not linear. 

Please understand that a linear equation is only an approximation to reality. The 
approximation can be very good over an important range of values. Newton’s Law F = ma 
is linear and we live by it every day. But Einstein showed that the mass m is not a constant, 
it increases with the velocity. We don’t notice this until we are near the speed of light. 

Similarly the stretch in a spring is proportional to the force—for a while. A really large 
force will stretch the spring way out of shape. That takes us to nonlinear elasticity. Eventually 
the spring breaks. 

The same for analysis of a car crash. Linear at very slow speed, nonlinear at normal 
speeds, total wreck at high speeds. A crash is a very difficult problem in computational 
mechanics. So is the effect of dropping a cell phone. This has been studied in great detail. 


Back to linear equations, starting with constant a and y(0) and q. 
Model 1 y(t) = money in a savings account 


This is the example we already started. We have a formula for the answer, now we use it. 
That formula is based on a continuous savings rate q(t) (deposits every instant, not every 
month). It also has continuous interest ay (computed every instant, not every month or every 
year). Continuous compounding does not bring instant riches. Just a little more income, by 
computing interest day and night. 

Suppose we get 3% interest. This number is a = .03, but what are the “units” of a? The
rate is 3% per year. There is a time dimension. If we change to months, the same rate is
now a = 3%/12 = .0025 per month.


Units of a are 1/time   To change from years to months, divide a by 12.

You can see this in the equation dy/dt = ay. Both sides have y. So a on the right
agrees dimensionally with 1/t on the left. Frequency is also 1/time; $i\omega - a$ is good!
The savings rate q has the same dimension as ay. The dimension of q is money / time.

We see that in the words too: q = 100 dollars per month.


Question: Does y(t) grow or decay? This depends on y(0) and a and q.


So far a and q have been positive; we were saving. If we spend money constantly,
then q changes to negative. Interest is still entering because a is positive. Does q win or
does a win? Do we spend all our deposit and drop to y = 0, or does the interest ay(t)
allow us to keep up the spending level q forever?


Answer: If we start with ay(0) + q > 0, then y(t) will grow even if q < 0.


The reason is in the differential equation dy/dt = ay(t) + q. If the right side is positive at
time t = 0, then y starts growing. So the right side stays positive, and y keeps growing.
Common sense gives the same answer: If ay + q > 0, the interest ay coming in stays
ahead of the spending going out.

A question for you. Suppose a < 0 but q > 0. Your investment is going down at rate a.
You are adding new investments at rate q. Overall, does your account go up or down?

You won't actually hit zero, because $e^{at}$ stays positive forever, even if a < 0. You
approach the steady state $y_\infty = -q/a$. In reality, the end of prosperity has come.
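
These questions can be tried with numbers. The sketch below uses the constant-coefficient formula $y(t) = e^{at}y(0) + q(e^{at} - 1)/a$; the dollar amounts and rates are hypothetical, chosen only to put ay(0) + q on either side of zero.

```python
import math

def balance(a, y0, q, t):
    """y(t) = e^{at} y(0) + q (e^{at} - 1)/a for constant a != 0 and constant q."""
    growth = math.exp(a * t)
    return growth * y0 + q * (growth - 1.0) / a

# Hypothetical account: 3% interest and a $10,000 deposit, followed for 20 years.
print(balance(0.03, 10000.0, -200.0, 20.0))   # a*y(0) + q = +100: the balance grows
print(balance(0.03, 10000.0, -400.0, 20.0))   # a*y(0) + q = -100: the balance falls

# a < 0 with q > 0: the balance approaches the steady state -q/a = 2000.
print(balance(-0.05, 10000.0, 100.0, 200.0))
```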


Now I will compare continuous compounding (expressed by a differential equation)
with ordinary compounding (a difference equation). The difference equation starts with the
same $Y_0 = y(0)$. This changes to $Y_1$ and then $Y_2$ and $Y_3$, taking a finite step each year.
When the time step $\Delta t$ is one year, the interest rate is A per year and the saving rate is
Q dollars per year :

$\dfrac{dy}{dt} = ay + q$   changes to   $\dfrac{Y_{n+1} - Y_n}{\Delta t} = AY_n + Q$   (14)

We don't need calculus for difference equations. The derivative enters when the time
step $\Delta t$ approaches zero. The model looks simpler if I multiply equation (14) by $\Delta t$:

One step, n to n + 1   $Y_{n+1} = (1 + A\,\Delta t)\,Y_n + Q\,\Delta t$   (15)

At the end of year n, the bank adds interest $A\,\Delta t\,Y_n$ to the balance $Y_n$ you already have.
You also put in new savings (or you spend if Q < 0). The new year starts with $Y_{n+1}$.
In case $A\,\Delta t = at/N$ and Q = 0, we are back to $Y_{n+1} = (1 + at/N)\,Y_n$:

N steps from 0 to N   $Y_N = \left(1 + \dfrac{at}{N}\right)^N Y_0 \to e^{at}y(0)$ as $N \to \infty$.
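
The approach of $(1 + at/N)^N$ to $e^{at}$ is easy to watch. A minimal sketch, assuming Q = 0 and the hypothetical values a = 3% per year and t = 10 years; the name `compound` is an illustrative choice.

```python
import math

def compound(a, t, N, y0=1.0):
    """Take N steps Y_{n+1} = (1 + a t/N) Y_n starting from Y_0 = y0 (with Q = 0)."""
    Y = y0
    for _ in range(N):
        Y = (1.0 + a * t / N) * Y
    return Y

a, t = 0.03, 10.0
for N in (10, 120, 3650):          # yearly, monthly, daily steps over 10 years
    print(N, compound(a, t, N))
print("continuous", math.exp(a * t))
```

More steps give a little more income, and the continuous limit is the ceiling.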


Model 2 Radioactive Decay 


The next models will deal with decay. The growth rate a is negative. The solution y
is decreasing. Decay is an expected and natural result when a < 0. In fact the differential
equation is called stable when all solutions approach zero. In many applications this is highly
desired.

Exponential growth with a > 0 may be good for bank accounts, but not for a drug in our 
bloodstream. Here are examples where any starting amount y(0) decays exponentially: 


A radioactive isotope like Carbon 14 
Newton’s Law of Cooling 
The concentration of a drug in our bloodstream 


I will emphasize the half-life—the time for half of the Carbon 14 to decay, or half 
of the drug to disappear. This is decided by the decay rate a < 0 in the equation y’ = ay. 


The half-life H is the opposite of the doubling time D, when a > 0 and $e^{aD} = 2$.
Half-life and Doubling Time


How long does it take for y(t) to be reduced to half of y(0)? The equation y' = ay has the
solution $e^{at}y(0)$, and we know that a < 0.

Half-life H   $e^{aH} = \dfrac{1}{2}$   $aH = \ln\dfrac{1}{2} = -\ln 2$   $H = \dfrac{-\ln 2}{a}$

That answer H is positive because a < 0. For Carbon 14 the half-life H is 5730 years.
It has just taken 150 hours on a Cray XT5 supercomputer to find 8 eigenvalues of
a matrix of size 1 billion, to explain that long half-life. Other carbon isotopes have
H = 20 minutes. Going in reverse, H tells us the decay rate:

Decay rate a   $a = \dfrac{-\ln 2}{H} = \dfrac{-\ln 2}{5730} \approx -1.21 \times 10^{-4}$ per year.


The “quarter-life” would be 2H, twice as long as the half-life. The time to divide by e is

Relaxation time $\tau$   $e^{a\tau} = e^{-1} = 0.368$   $a\tau = -1$   $\tau = \dfrac{-1}{a}$


Question.   Suppose we find a sample where 60% of the Carbon 14 remains. How old
is the sample? If the carbon came from a tree, its decay started at the
moment when the tree died.

Answer.   The age T is the time when $e^{aT} = 0.6$. At that time
$aT = \ln(0.6)$ and $T = \dfrac{\ln(0.6)}{a} \approx 4200$ years.

The doubling time D uses the same ideas but now the growth rate is a > 0:

Doubling time   $e^{aD} = 2$   $aD = \ln 2$   $D = \dfrac{\ln 2}{a}$


At 5% interest (a = .05/year) the doubling time is less than 14 years. Not 20 years. 
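
The half-life, dating and doubling formulas are one-line computations. A short sketch, with function names that are illustrative choices; the inputs 5730 years, 60% and 5% come from the text.

```python
import math

def half_life(a):
    """H = -ln 2 / a for a decay rate a < 0."""
    return -math.log(2.0) / a

def doubling_time(a):
    """D = ln 2 / a for a growth rate a > 0."""
    return math.log(2.0) / a

def age_from_fraction(fraction, a):
    """Solve e^{aT} = fraction for the age T."""
    return math.log(fraction) / a

a_c14 = -math.log(2.0) / 5730.0        # decay rate of Carbon 14, per year
print(a_c14)                           # close to -1.21e-4
print(age_from_fraction(0.6, a_c14))   # roughly 4200 years
print(doubling_time(0.05))             # a little under 14 years
```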
Model 3 Newton’s Law of Cooling 


When you put water in a freezer, it cools down. So does a cup of hot coffee on a table. 
The rate of cooling is proportional to the temperature difference. 


Newton's Law   $\dfrac{dT}{dt} = k(T_\infty - T)$   $T_\infty$ = surrounding temperature


This is a linear constant coefficient equation. The solution approaches $T_\infty$. Include that
constant on the left side, to make the equation and the solution clear:

$\dfrac{d(T - T_\infty)}{dt} = k(T_\infty - T)$   $T - T_\infty = e^{-kt}(T_0 - T_\infty)$
Question.   Suppose the starting temperature difference $T_0 - T_\infty$ is 80°. After 90 minutes
the difference $T - T_\infty$ has dropped to 20°. At what time will the difference be 10°?
When will the temperature reach $T_\infty$?


Answer.   The starting difference 80° is divided by 4 in 90 minutes. To divide again by 2
takes 45 minutes from 20° to 10°, so the difference is 10° at 135 minutes. There you see
a fundamental rule for exponentials :


If $e^{-90k} = 1/4$ then $e^{-45k} = \sqrt{1/4} = 1/2$. It is not necessary to know k.
The temperature never reaches $T_\infty$ exactly. The exponential $e^{-kt}$ never reaches 0 exactly.
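
The cooling answer did not need k, but computing k anyway gives a check. A minimal sketch using the numbers of the question (80° down to 20° in 90 minutes); the names `difference` and `t_10` are illustrative choices.

```python
import math

# The difference drops from 80 to 20 degrees in 90 minutes: e^{-90k} = 1/4.
k = math.log(4.0) / 90.0                 # per minute

def difference(t, d0=80.0):
    """Temperature difference T - T_infinity after t minutes."""
    return d0 * math.exp(-k * t)

t_10 = math.log(80.0 / 10.0) / k         # time when the difference is 10 degrees
print(t_10)                              # 90 + 45 = 135 minutes
print(difference(90.0), difference(135.0))
```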
Model 4 Drug Elimination 


The concentration C(t) of a drug in the bloodstream drops at a rate proportional to C(t)
itself. Then dC/dt = -kC. The elimination constant k > 0 is carefully measured, and
$C(t) = e^{-kt}C(0)$.

Suppose you want to maintain at least G grams in your body. If you are taking the drug 
every 8 hours, what dose should you take ? 


t = 8 hours   k = decay rate per hour   Take $e^{8k}G$ grams.
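
A quick check of that dose: an amount $e^{8k}G$ decays to exactly G after 8 hours. The values of G and k below are hypothetical, chosen only for illustration, and `level_after` is an illustrative name.

```python
import math

def level_after(amount, k, hours):
    """Amount left after `hours`, from C(t) = e^{-kt} C(0)."""
    return amount * math.exp(-k * hours)

G = 2.0                       # hypothetical minimum amount, in grams
k = 0.1                       # hypothetical elimination constant, per hour
dose = math.exp(8 * k) * G    # e^{8k} G grams
print(dose, level_after(dose, k, 8.0))   # the level is back to G after 8 hours
```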


Model 5 Population growth 


Certainly the world population is increasing. Its growth rate a is the birth rate minus the death
rate. A reasonable estimate for a right now is 1.3% a year, or a = .013/year
(the dimension of a is 1/time). A first model assumes this growth rate to be constant,
continuing forever. Now we ask for the doubling time, a number that is independent of
the starting value y(0) :

Doubling time D   $D = \dfrac{\ln 2}{a} = \dfrac{\ln 2}{.013}$ years ≈ 53 years.

World population   $\dfrac{dy}{dt} = .013\,y$   and   $y(t) = e^{.013t}\,y(0)$.

The “forever” part is unrealistic. After 1000 years, it produces $e^{13}y(0)$. That number $e^{13}$
is enormous. If we start today (so that t = 0 is the year we are living in)
then eventually we will have about one atom each. Ridiculous. But it is quite possible
that the pure growth equation y' = ay does describe the real population for a short time.

Eventually the equation has to be corrected. We need a nonlinear term like $-by^2$,
to model the effect of competition (y against y). As y gets large, $y^2$ gets much larger.
Then $-by^2$ subtracts from dy/dt and eventually competition stops growth.

This is the famous “logistic equation” $dy/dt = ay - by^2$. It is solved in Section 1.7.
Here I want to end with a problem of scientific importance—the changing temperature of the 
Earth. The equations are nonlinear. The data is incomplete. There is no solution formula. 
This is the reality of science. 
Energy Balance Equations


The Earth gets practically all its energy from the Sun. A lot of that energy goes back out
into space. This is radiation in and radiation out. The energy that doesn't go back is
responsible for changing the Earth's temperature T.

This energy balance is crucial to our lives. It won’t permit life on Mercury (too hot), and 
certainly not on Pluto (too cold). We are extremely fortunate to live on Earth. The form of 
the temperature equation is completely typical of balance equations in applied mathematics : 


Energy in minus energy out. This raises the temperature T   $C\,\dfrac{dT}{dt} = E_{\text{in}} - E_{\text{out}}$   (16)


There is a coefficient C in every equation like this. Let me show you another balance equation,
to emphasize how the problem can change but the form stays the same.


Flow into a bathtub minus flow out. This raises the water height H   $A\,\dfrac{dH}{dt} = F_{\text{in}} - F_{\text{out}}$   (17)


The tap controls the incoming flow $F_{\text{in}}$. The drain controls the outgoing flow $F_{\text{out}}$. The
volume of water changes according to $dV/dt = F_{\text{in}} - F_{\text{out}}$. That volume change dV/dt
is a height change dH/dt multiplied by A = area of the water surface. Check units:


H = meters   A = (meters)²   V = (meters)³   t = seconds   F = (meters)³/second


I include this bathtub example because it makes the balance clear : 
1. Flow rate in minus flow rate out equals fill rate dV/dt. 
2. Volume change dV/dt splits into (A) (dH /dt) = area times height change. 


In a curved bathtub, the water area A changes with the height H. Then equation (17) 
is nonlinear. Every scientist looks immediately at the balance equation: Can it be linear ? 
Can its coefficients be constant? The true answer is no, the practical answer is often yes. 
(Numerical methods are slowed by nonlinearity. Analytical methods are usually destroyed.) 


Energy Balance for the Earth 


The energy balance equation $C\,dT/dt = E_{\text{in}} - E_{\text{out}}$ is the start. Temperature is in Kelvin
(degrees Celsius are also used). The heat capacity C is the energy needed to raise the
temperature by 1 degree (just as the area A was the volume of water that raises the height
of water by 1 meter). That heat capacity C truly changes between ice and ocean and land.
Exactly as predicted, the starting simplification is C = constant.
On the right side of the equation, the energy $E_{\text{in}}$ is coming from the Sun. A serious
fraction $\alpha$ of the arriving energy bounces back and is never absorbed. This fraction $\alpha$ is the
albedo. It can vary from .80 for snow to .08 for ocean. On a global scale, we have to simplify
the albedo formula to a constant, and then improve it :

Constant   $\alpha = .30$ for all T   Piecewise linear   $\alpha = \begin{cases} .60 & \text{if } T < 255\,\text{K} \\ .20 & \text{if } T > 290\,\text{K} \end{cases}$

The main point is that $E_{\text{in}} = (1 - \alpha)Q$, where Q measures energy flow from the Sun
to a unit area of the Earth. Now we turn to $E_{\text{out}}$.

Radiation of energy is theoretically proportional to $T^4$ (the Stefan-Boltzmann law). There
is an ideal constant $\sigma$ from quantum theory, but the Earth is not ideal. The “greenhouse
effect” of particles in the atmosphere reduces $\sigma$ by an emission factor close to
$\varepsilon = .62$. For a unit area, the radiation $E_{\text{out}}$ is $\varepsilon\sigma T^4$ and the radiation $E_{\text{in}}$ is $(1 - \alpha)Q$:

Energy balance $E_{\text{in}} = E_{\text{out}}$   $(1 - \alpha)Q = \varepsilon\sigma T^4$   $T = \left(\dfrac{(1 - \alpha)Q}{\varepsilon\sigma}\right)^{1/4}$


You understand that these are not fixed laws like Einstein's $e = mc^2$. Satellites measure
the actual radiation, sensors measure the actual temperature. That nonlinear $T^4$ formula
is often replaced by a linear A + BT. This gives the most basic model of a steady state.
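
The steady temperature from $E_{\text{in}} = E_{\text{out}}$ can be evaluated once Q and $\sigma$ are given numbers. The text does not state them, so the sketch below assumes Q = 342 watts per square meter (a commonly used global average) and the standard Stefan-Boltzmann constant; `steady_temperature` is an illustrative name. The albedo .30 and the factor .62 are from the text.

```python
SIGMA = 5.67e-8     # Stefan-Boltzmann constant, W/(m^2 K^4)
Q = 342.0           # ASSUMED average solar input, W/m^2 (not given in the text)

def steady_temperature(albedo, emission, q=Q, sigma=SIGMA):
    """Solve (1 - albedo) q = emission * sigma * T^4 for T in Kelvin."""
    return ((1.0 - albedo) * q / (emission * sigma)) ** 0.25

print(steady_temperature(0.30, 0.62))   # with the greenhouse factor, near 287 K
print(steady_temperature(0.30, 1.00))   # ideal radiator, near 255 K
```

Under these assumptions the emission factor is what separates a frozen planet from a livable one.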


Multiple Steady States 


I will take one more step with that model: we are on the edge of real science. You know
that the albedo $\alpha$ (the bounceback of solar energy) depends on the temperature T. The
coefficients A and B and $\varepsilon$ also depend on T. The temperature balance equation
$C\,dT/dt = E_{\text{in}} - E_{\text{out}}$ and the steady equilibrium equation $E_{\text{in}} = E_{\text{out}}$ are not linear.
From a nonlinear model, what can we learn?


Point 1   $E_{\text{in}}(T) = E_{\text{out}}(T)$ can easily have more than one solution T.
Point 2   Those steady states when dT/dt = 0 can be stable or unstable.
Point 3   You can see $T_1$ and $T_3$ (stable) and $T_2$ (unstable) in this graph of $E_{\text{in}}$ and $E_{\text{out}}$.


Why is $T_2$ unstable? If T is just above $T_2$, then $E_{\text{in}} > E_{\text{out}}$. Therefore dT/dt > 0
and the temperature climbs further away from $T_2$. If T is just below $T_2$, then $E_{\text{in}} < E_{\text{out}}$.
Therefore dT/dt < 0 and T falls further below $T_2$.

The next section 1.7 shows how to decide stability or instability for any equation 
dT/dt = f(T) or dy/dt = f(y). Just as here, each steady state has f(T) = 0. 
Stable steady states also have df /dT < 0 or df /dy < 0. Simple and important. 
[Figure 1.8: graph of $E_{\text{in}}$ and $E_{\text{out}}$ against temperature T from 220 to 300, with the curve label $E_{\text{in}} = 1000(0.5 + 0.2\tanh((T - 265)/10))/3$ and steady states marked at $T_1 = 235$, $T_2 = 264$, $T_3 = 290$.]

Figure 1.8: The analysis and the graph are from Mathematics and Climate by Hans Kaper
and Hans Engler (SIAM, 2013). $E_{\text{in}} - E_{\text{out}}$ has slope < 0 at two stable steady states.
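
The three steady states and their stability can be located numerically. This is a sketch under stated assumptions: $E_{\text{in}}$ is copied from the label on Figure 1.8, while $E_{\text{out}} = \varepsilon\sigma T^4$ uses the text's $\varepsilon = .62$ and the standard $\sigma$, which may not be the exact constants behind the figure. The crossings therefore land within a few degrees of the figure's values rather than exactly on them. The helper names are illustrative.

```python
import math

SIGMA = 5.67e-8   # Stefan-Boltzmann constant in W/(m^2 K^4)
EPS = 0.62        # emission factor quoted in the text (ASSUMED to apply here)

def e_in(T):
    # Incoming energy, copied from the label on Figure 1.8
    return 1000.0 * (0.5 + 0.2 * math.tanh((T - 265.0) / 10.0)) / 3.0

def e_out(T):
    # Outgoing radiation eps * sigma * T^4
    return EPS * SIGMA * T**4

def f(T):
    return e_in(T) - e_out(T)

def bisect(lo, hi, tol=1e-9):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def steady_states(t_min=220, t_max=300):
    """Scan one degree at a time and refine every sign change of f."""
    return [bisect(T, T + 1) for T in range(t_min, t_max) if f(T) * f(T + 1) < 0]

def is_stable(T, h=1e-3):
    """Stable when df/dT < 0 (central difference estimate)."""
    return (f(T + h) - f(T - h)) / (2 * h) < 0

for T in steady_states():
    print(round(T, 1), "stable" if is_stable(T) else "unstable")
```

The pattern stable, unstable, stable is what matters; the exact temperatures move with the choice of constants.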


Problem Set 1.6 


1   Solve the equation dy/dt = y + 1 up to time t, starting from y(0) = 4.


2   You have $1000 to invest at rate a = 1 = 100%. Compare after one year the result
of depositing y(0) = 1000 immediately with q = 0, or choosing y(0) = 0 and
q = 1000/year to deposit continually during the year. In both cases dy/dt = y + q.


3 If dy/dt = y — 1, when does your original deposit y(0) = 5 drop to zero? 


4   Solve $\dfrac{dy}{dt} = y + t^2$ from y(0) = 1 with increasing source term $t^2$.

5   Solve $\dfrac{dy}{dt} = y + e^t$ (resonance a = c !) from y(0) = 1 with exponential source $e^t$.

6   Solve $\dfrac{dy}{dt} = y - t^2$ from an initial deposit y(0) = 1. The spending $q(t) = -t^2$ is
growing. When (if ever) does y(t) drop to zero ?

7   Solve $\dfrac{dy}{dt} = y - e^t$ from an initial deposit y(0) = 1. This spending term $-e^t$ grows at
the same $e^t$ rate as the initial deposit. When (if ever) does y drop to zero ?

8   Solve $\dfrac{dy}{dt} = y - e^{2t}$ from y(0) = 1. At what time T is y(T) = 0 ?


9   Which solution (y or Y) is eventually larger if y(0) = 0 and Y(0) = 0 ?

$\dfrac{dy}{dt} = y + 2t$   or   $\dfrac{dY}{dt} = 2Y + t$.

10   Compare the linear equation y' = y to the separable equation $y' = y^2$ starting from
y(0) = 1. Which solution y(t) must grow faster ? It grows so fast that it blows up to
$y(T) = \infty$ at what time T ?

11   Y' = 2Y has a larger growth factor (because a = 2) than y' = y + q(t).
What source q(t) would be needed to keep y(t) = Y(t) for all time ?

12   Starting from y(0) = Y(0) = 1, does y(t) or Y(t) eventually become larger ?

$\dfrac{dy}{dt} = 2y + e^t$   $\dfrac{dY}{dt} = Y + e^{2t}$.


Questions 13-18 are about the growth factor G(s, t) from time s to time t.

13   What is the factor G(s, s) in zero time ? Find $G(s, \infty)$ if a = -1 and if a = 1.

14   Explain the important statement after equation (13): The growth factor G(s, t) is the
solution to $y' = a(t)y + \delta(t - s)$. The source $\delta(t - s)$ deposits $1 at time s.

15   Now explain this meaning of G(s, t) when t is less than s. We go backwards in time.
For t < s, G(s, t) is the value at time t that will grow to equal 1 at time s.
When t = 0, G(s, 0) is the “present value” of a promise to pay $1 at time s. If
the interest rate is a = 0.1 = 10% per year, what is the present value G(s, 0) of
a million dollar inheritance promised in s = 10 years ?

16   (a) What is the growth factor G(s, t) for the equation y' = (sin t)y + Q sin t ?
(b) What is the null solution $y_n = G(0, t)$ to y' = (sin t)y when y(0) = 1 ?
(c) What is the particular solution $y_p = \int_0^t G(s,t)\,Q \sin s\,ds$ ?

17   (a) What is the growth factor G(s, t) for the equation y' = y/(t + 1) + 10 ?
(b) What is the null solution $y_n = G(0, t)$ to y' = y/(t + 1) with y(0) = 1 ?
(c) What is the particular solution $y_p = 10\int_0^t G(s,t)\,ds$ ?

18   Why is G(t, s) = 1/G(s, t) ? Why is G(s, t) = G(s, S)G(S, t) ?
Problems 19-22 are about the “units” or “dimensions” in differential equations.

19   (recommended) If $dy/dt = ay + qe^{i\omega t}$, with t in seconds and y in meters, what are
the units for a and q and $\omega$ ?

20   The logistic equation $dy/dt = ay - by^2$ often measures the time t in years (and y
counts people). What are the units of a and b ?

21   Newton's Law is $m\,d^2y/dt^2 + ky = F$. If the mass m is in grams, y is in meters,
and t is in seconds, what are the units of the stiffness k and the force F ?

22   Why is our favorite example y' = y + 1 very unsatisfactory dimensionally ? Solve it
anyway starting from y(0) = -1 and from y(0) = 0.


23   The difference equation $Y_{n+1} = cY_n + Q_n$ produces $Y_1 = cY_0 + Q_0$. Show that the
next step produces $Y_2 = c^2Y_0 + cQ_0 + Q_1$. After N steps, the solution formula for $Y_N$
is like the solution formula for y' = ay + q(t). Exponentials of a change to powers of
c, the null solution $e^{at}y(0)$ becomes $c^N Y_0$. The particular solution

$Y_N = c^{N-1}Q_0 + \cdots + Q_{N-1}$   is like   $y(t) = \int_0^t e^{a(t-s)}q(s)\,ds.$


24 Suppose a fungus doubles in size every day, and it weighs a pound after 10 days. 
If another fungus was twice as large at the start, would it weigh a pound in 5 days ? 


1.7 The Logistic Equation 


This section presents one particular nonlinear differential equation, the logistic equation.
It is a model of growth slowed down by competition. In later chapters, one group $y_1$ will
compete against another group $y_2$. Here the competition is inside one group. The growth
comes from ay as usual. The competition (y against y) comes from $-by^2$.

Logistic equation / nonlinear   $\dfrac{dy}{dt} = ay - by^2$   (1)


We will discuss the meaning of this equation, and its solution y(t). 

One key idea comes right away: the steady state. Any time we have dy/dt = f(y), it 
is important to know when f(y) is zero. Growth stops at that point because dy/dt is zero. 
If the number Y solves f(Y) = 0, the constant function y(t) = Y solves the equation 
dy/dt = f(y): both sides are zero. For the special starting value y(0) = Y, the solution 
would stay at Y. It is a steady solution, not changing with time. 

The logistic equation has two steady states with f(Y) = 0: 


$\dfrac{dY}{dt} = aY - bY^2 = 0$ when $aY = bY^2$. Then Y = 0 or Y = a/b.   (2)
That point a/b is where competition balances growth. It is the top of the “S-curve” 
in Figure 1.9, where the curve goes flat. It is the end of growth. The solution y(t) cannot 
get past the value a/b. At the start of the S-curve, the other steady state Y = 0 is unstable. 
The curve goes away from Y = 0 and toward Y = a/b. 

In some applications, this number a/b is the carrying capacity (K) of the system.
If a/b = K then b = a/K. So the logistic equation can be written in terms of a and
K:

$\dfrac{dy}{dt} = ay - by^2 = ay - \dfrac{a}{K}\,y^2 = ay\left(1 - \dfrac{y}{K}\right).$   (3)


Mathematically, we have done nothing interesting. But the number K may be easier to
work with than b. We might have an estimate like K = 12 billion people for the maximum
population that the world can deal with. Rewriting the equation doesn't change the solution,
but it can help our understanding.


Solution of the Logistic Equation 


What is y(t) ? The logistic equation is nonlinear because of $y^2$, and most nonlinear equations
have no solution formula. ($y = Ce^{at}$ is extremely unlikely.) But the particular equation
$dy/dt = ay - by^2$ can be solved, and I want to present two ways to do it:
1 (by magic)   The equation for z = 1/y happens to be linear: dz/dt = -az + b.
We can solve that equation and then we know y.
2 (by partial fractions)   This systematic approach takes longer. In principle, partial
fractions can be used any time dy/dt is a ratio of polynomials in y.
You will appreciate method 1 (only two steps A and B) after you see method 2.
(A) If $z = \dfrac{1}{y}$, the chain rule gives $\dfrac{dz}{dt} = \dfrac{-1}{y^2}\dfrac{dy}{dt}$. Substitute $ay - by^2$ for $\dfrac{dy}{dt}$ :

$\dfrac{dz}{dt} = \dfrac{-(ay - by^2)}{y^2} = -\dfrac{a}{y} + b = -az + b.$   (4)

(B) This is the linear equation z' + az = b that was solved in the previous sections.
Change a to -a in the solution formula. Change y and q to z and b:

Solution   $z(t) = e^{-at}z(0) - \dfrac{b}{a}\left(e^{-at} - 1\right) = \dfrac{d\,e^{-at} + b}{a}.$   (5)

The number d collects all the constants a, y(0), b in one place:

$\dfrac{d}{a} = z(0) - \dfrac{b}{a}$ and $z(0) = \dfrac{1}{y(0)}$ produce $d = \dfrac{a}{y(0)} - b.$   (6)

Now turn equation (5) upside down to find y = 1/z:

Solution to the logistic equation   $y(t) = \dfrac{a}{d\,e^{-at} + b}.$   (7)

This is a beautiful solution. Look at its value for large positive t and large negative t : 


Approaching $t = +\infty$   $e^{-at} \to 0$ and $y(t) \to \dfrac{a}{b}$
Approaching $t = -\infty$   $e^{-at} \to \infty$ and $y(t) \to 0$

Far back in time, the population was near Y = 0. Far forward in time, the population
will approach Y = a/b. Those are the two steady states, the points where $ay - by^2$
is zero and the curve becomes flat. Then dy/dt is zero and y never changes.

In between, the population y(t) is following an S-curve, climbing toward a/b. It is 
symmetric around the halfway point y = a/2b. The world is near that point right now. 


[Figure 1.9: the S-curve y(t), with the inflection point y = a/2b marked at the halfway time.]

Figure 1.9: The S-curve solves the logistic equation. The inflection point is halfway.
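
The solution formula $y(t) = a/(d\,e^{-at} + b)$ with $d = a/y(0) - b$ can be checked against a direct numerical integration of $dy/dt = ay - by^2$. The sketch below uses the classical fourth-order Runge-Kutta method as the independent check (my choice of method, not one the text uses at this point); the function names are illustrative.

```python
import math

def logistic(t, a, b, y0):
    """Formula (7): y(t) = a / (d e^{-at} + b) with d = a/y(0) - b."""
    d = a / y0 - b
    return a / (d * math.exp(-a * t) + b)

def rk4_logistic(t_end, a, b, y0, n=1000):
    """Integrate dy/dt = ay - by^2 from 0 to t_end with classical Runge-Kutta."""
    f = lambda y: a * y - b * y * y
    h = t_end / n
    y = y0
    for _ in range(n):
        k1 = f(y)
        k2 = f(y + 0.5 * h * k1)
        k3 = f(y + 0.5 * h * k2)
        k4 = f(y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return y

a, b, y0 = 1.0, 1.0, 0.5
print(logistic(3.0, a, b, y0), rk4_logistic(3.0, a, b, y0))   # should nearly agree
print(logistic(50.0, a, b, y0))                               # close to a/b = 1
```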
Simplest Example of the S-curve


The best example has a = b = 1. The top of the S-curve is Y = a/b = 1. The bottom
is Y = 0. The halfway time is t = 0, where $y(0) = \frac{1}{2}$. Then the logistic equation and its
solution are as simple as possible :

$\dfrac{dy}{dt} = y - y^2$ has the solution $y(t) = \dfrac{1}{1 + e^{-t}}$ starting from $y(0) = \dfrac{1}{2}$.   (8)

That solution $1/(1 + e^{-t})$ approaches 1 when $t \to \infty$. It approaches 0 when $t \to -\infty$.
Let me review the “z = 1/y method” to solve the logistic equation $y' = y - y^2$.

$\dfrac{dz}{dt} = \dfrac{-1}{y^2}\dfrac{dy}{dt} = \dfrac{-y + y^2}{y^2} = -z + 1.$

Then $z(t) = 1 + Ce^{-t}$. Take C = 1 to match $y(0) = \frac{1}{2}$ and z(0) = 2. Now $y = \dfrac{1}{z} = \dfrac{1}{1 + e^{-t}}$.


World Population and the Carrying Capacity K


What are the numbers a and b for human population ? Ecologists estimate the natural growth
rate at a = .029 per year. This is not the actual rate, because of b. About 1930, the world
population was near y = 3 billion. The ay term predicts a one-year increase of (.029)(3
billion) = 87 million. The actual growth was more like dy/dt = 60 million/year. In this
simple model, that difference of 27 million/year was caused by $by^2$ :

27 million/year = b (3 billion)² leads to $b = 3 \times 10^{-12}$/year.

When we know b, we know the steady state $y(\infty) = K = a/b$. At that point the loss $by^2$
from competition balances the gain ay from growth:

Estimated capacity   $K = \dfrac{a}{b} = \dfrac{.029}{3 \times 10^{-12}} \approx 9.7$ billion people.

This number is low, and y is growing faster. The estimates I see now are closer to
$y(\infty) \approx 10$ billion and $y(2014) \approx 7.2$ billion.
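
The arithmetic behind b and K takes three lines. The inputs (a = .029, y = 3 billion, actual growth 60 million per year) are the ones quoted above; the variable names are illustrative.

```python
a = 0.029                   # natural growth rate per year
y = 3.0e9                   # world population used in the estimate
actual_growth = 60.0e6      # observed dy/dt, people per year

predicted_growth = a * y                        # the ay term: 87 million per year
b = (predicted_growth - actual_growth) / y**2   # competition coefficient
K = a / b                                       # carrying capacity a/b
print(predicted_growth, b, K)
```

The result is b near $3 \times 10^{-12}$ and K a little under 9.7 billion.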


Our world is beyond the halfway point y = a/2b on the curve. That looks like an
inflection point (by symmetry of the graph), and the test $d^2y/dt^2 = 0$ confirms that it is.

The inflection point with y'' = 0 is halfway up the curve in Figure 1.9

$\dfrac{d}{dt}\left(\dfrac{dy}{dt}\right) = \dfrac{d}{dt}(ay - by^2) = (a - 2by)\dfrac{dy}{dt} = 0$ when $y = \dfrac{a}{2b}$   (9)

After this halfway point, the S-curve bends downward. The population y is still increasing,
but its growth rate dy/dt is decreasing. (Notice the difference.) The inflection point
separates “bending up” from “bending down” and the rate of growth is a maximum at
that point. You will understand that this simple model must be and has been improved.
Partial Fractions


The logistic equation is nonlinear but it is separable. We can separate y from t as follows :

$\dfrac{dy}{dt} = ay - by^2 = a\left(y - \dfrac{b}{a}\,y^2\right)$ leads to $\dfrac{dy}{y - \frac{b}{a}y^2} = a\,dt.$   (10)
In this separated form, the problem is reduced to two ordinary integrations (y-integration on 
the left side, t-integration on the right side). The integral of a dt on the right side is certainly 
at + C. The left side can be looked up in a table of integrals or produced by software like 
Mathematica or discovered by ourselves. 
I will explain the idea of partial fractions that produces this integral. You may know it as 
a “Technique of Integration” from first-year calculus (it is really just algebra). 
The plan is to split the fraction in two pieces so the integration becomes easy : 


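$$\text{Partial fractions}\quad \frac{1}{y - \frac{b}{a}y^2} \quad\text{separates into}\quad \frac{A}{y} + \frac{B}{1 - \frac{b}{a}y} \tag{11}$$

I factored $y - \frac{b}{a}y^2$ into y times $1 - \frac{b}{a}y$. I put those two denominators on the right side.
We need to know A and B. To compare with the left side, combine those two fractions:

$$\text{Common denominator}\quad \frac{A}{y} + \frac{B}{1 - \frac{b}{a}y} = \frac{A\left(1 - \frac{b}{a}y\right) + By}{y\left(1 - \frac{b}{a}y\right)} \tag{12}$$

The correct A and B must produce 1 in the numerator, to match the 1 in equation (11):

$$A\left(1 - \frac{b}{a}y\right) + By = 1 \quad\text{when}\quad A = 1 \text{ and } B = \frac{b}{a}. \tag{13}$$

This completes the algebra of partial fractions, by finding A and B in equation (11):

$$\text{Two fractions}\quad \frac{1}{y - \frac{b}{a}y^2} = \frac{1}{y\left(1 - \frac{b}{a}y\right)} = \frac{1}{y} + \frac{b/a}{1 - \frac{b}{a}y} \tag{14}$$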


Integrate the Partial Fractions 


With A = 1 and B = b/a, integrate the two partial fractions separately:

$$\int \frac{dy}{y} + \int \frac{(b/a)\,dy}{1 - (b/a)y} = \ln y - \ln\left(1 - \frac{b}{a}y\right). \tag{15}$$
This is the calculus part (the integration) in solving the logistic equation. After the 
integration, use algebra to write the answer y(t) in a good form. 
Actually that good form of y(t) was already found by our first method. The magic
of z = 1/y produced a linear equation dz/dt = −az + b. Then returning to y = 1/z
put the crucial factor $e^{-at}$ into the denominator of (7), and we repeat that solution here:

$$\text{Solution in (7)}\quad y(t) = \frac{a}{d\,e^{-at} + b} \quad\text{with}\quad d = \frac{a}{y(0)} - b. \tag{16}$$


This same answer must come from the integral (15) that used partial fractions. The
integral has the form ln y − ln x, which is the same as ln(y/x) (and x is 1 − (b/a)y).

$$\int \frac{dy}{y - \frac{b}{a}y^2} = \int a\,dt \quad\text{gives}\quad \ln\frac{y}{1 - \frac{b}{a}y} = at + \ln\frac{y(0)}{1 - \frac{b}{a}y(0)}. \tag{17}$$


I chose the integration constant C to make (17) correct at t = 0. Now take exponentials
of both sides:

$$\frac{y}{1 - \frac{b}{a}y} = e^{at}\,\frac{y(0)}{1 - \frac{b}{a}y(0)}. \tag{18}$$


The final algebra part is to solve this equation for y. Let me move that into Problem 3. Then 
we recover the good formula (16) that came so much faster from y = 1/z. 
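Before doing the algebra of Problem 3, it is reassuring to see numerically that formula (16) satisfies equation (18). This is a sketch with illustrative values a = 2, b = 0.5, y(0) = 1; the function names are made up for this check.

```python
import math

a, b, y0 = 2.0, 0.5, 1.0  # illustrative values with 0 < y0 < a/b

def y_formula(t):
    """Equation (16): y(t) = a / (d e^{-at} + b) with d = a/y(0) - b."""
    d = a / y0 - b
    return a / (d * math.exp(-a * t) + b)

def left_side_18(t):
    """Left side of (18), evaluated on the formula (16)."""
    y = y_formula(t)
    return y / (1 - (b / a) * y)

def right_side_18(t):
    """Right side of (18): e^{at} y(0) / (1 - (b/a) y(0))."""
    return math.exp(a * t) * y0 / (1 - (b / a) * y0)

for t in (0.0, 0.5, 1.0, 2.0):
    print(t, left_side_18(t), right_side_18(t))
```

The two columns agree up to rounding, which is what the partial fractions route promises.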

Looking ahead, partial fractions will appear again in Section 2.7. They simplify the 
Laplace transform so you can recognize the inverse transform. That section gives a formula 
PF2 for the numbers A and B in the fractions—it is previewed here in Problem 14. 


Again, we solved dy/dt = f(y) by separating ∫ dy/f(y) from ∫ dt.


Autonomous Equations dy/dt = f(y) 


The logistic equation is autonomous. This means that f depends only on y, and not on t:
dy/dt = f(y). A linear example is y' = y. The big advantage of an autonomous equation
is that the solution curve can stay the same when the starting value y(0) is changed. “We
just climb onto the curve at height y(0) and keep going.”

You saw how Figure 1.9 had the same S-curve for every y(0) between 0 and a/b.
The equation dy/dt = y has the same exponential curve $y = e^t$ for every y(0) > 0.
Just mark the t = 0 point wherever the height is y(0).


This means that time t is not essential in the graphs. The graph of f(y) against y is the
key. For the logistic equation, the parabola $f(y) = ay - by^2$ tells you everything (except the
time for each y). y(t) increases when this parabola f(y) is above the axis
(because dy/dt > 0 when f > 0). So I only drew one S-curve.

There is also a decreasing curve starting from y(0) > a/b. It approaches the steady
state Y = a/b from above. Another curve starts below Y = 0 and drops to −∞. The
upgoing S-curve is sandwiched between two downgoing curves, because in Figure 1.10
the positive piece of $ay - by^2$ is sandwiched between two negative pieces.
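The "climb onto the curve" idea can be tested directly on the logistic formula. In this sketch (illustrative a = b = 1, hypothetical function name `logistic`), a solution that starts at the height reached after time s is the original curve shifted by s.

```python
import math

a, b = 1.0, 1.0  # illustrative values

def logistic(t, y0):
    """S-curve y(t) = a / (d e^{-at} + b) starting from y(0) = y0."""
    d = a / y0 - b
    return a / (d * math.exp(-a * t) + b)

s = 1.5                          # time shift
y_start = 0.1                    # first starting height
y_later = logistic(s, y_start)   # height reached on that curve at time s

# Starting at y_later gives the same curve, with the clock reset by s.
for t in (0.0, 1.0, 3.0):
    print(t, logistic(t, y_later), logistic(t + s, y_start))
```

This only works because f depends on y alone. A time-dependent right side would give a different curve for each starting time.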
Stability of Steady States


The steady states of dy/dt = f(y) are solutions of f(Y) = 0. The differential equation 
becomes 0 = 0 when y(t) = Y is constant (steady). Here is the stability question: 


Starting close to Y, does y(t) approach Y (stable) or does it leave Y (unstable) ? 


We had a formula for the S-curve. So we could answer this stability question. One Y is 
stable (that is Y = a/b at the end). The steady state Y = 0 is unstable. It is important 
(and not hard) to be able to decide stability without a formula for y(t). 

Everything depends on the derivative df/dy at the steady value y = Y. That slope 
of f(y) will be called c. Here is the test for stability, followed by a reason and examples. 


Stable if c < 0    The steady state Y is stable if df/dy < 0 at y = Y.

Reason: Near the steady state, f(y) is close to c(y − Y). Then y' = f(y) is close to
(y − Y)' = c(y − Y). Then y − Y is like $e^{ct}$, and y → Y when c < 0 and $e^{ct} \to 0$.

Let me explain in detail for any autonomous equation dy/dt = f(y). Suppose that
Y = 0 is a steady state. This means that f(0) = 0. Calculus gives the linear approximation
f(y) ≈ cy, where c is the slope of the tangent line. That number is c = df/dy at Y = 0. If
c is negative then y(t) will move toward Y = 0 (stability):

For small y(0) > 0    dy/dt = f(y) ≈ cy < 0    y(t) decreases toward 0
For small y(0) < 0    dy/dt = f(y) ≈ cy > 0    y(t) increases toward 0

For any other steady state Y, calculus gives the linear approximation f(y) ≈ c(y − Y).
Now that number is c = df/dy, the slope of the tangent line at y = Y.

For y(0) just above Y    dy/dt = f(y) ≈ c(y − Y) < 0    y(t) decreases toward Y
For y(0) just below Y    dy/dt = f(y) ≈ c(y − Y) > 0    y(t) increases toward Y


Example 1 (logistic) The derivative of $ay - by^2$ is df/dy = a − 2by.
At the steady state Y = 0, df/dy is a > 0: Y = 0 is unstable.
At Y = a/b, this derivative is a − 2b(a/b) = −a. Y = a/b is stable.
For $dy/dt = ay - by^2$ this stability line shows which way y(t) moves from any y(0).

If y(0) is here,          Y = 0      If y(0) is here,          Y = a/b      If y(0) is here,
<------------------------- • -------------------------------> • <-------------------------
then y(t) goes to −∞                 then y(t) goes to a/b                  then y(t) goes to a/b


The steady states have to alternate between stable and unstable, because df/dy will
alternate between negative and positive. I am excluding the undecided cases when
f(Y) = 0 and also df/dy = 0 at y = Y. This is a borderline case for critical harvesting.
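The stability test is easy to automate when df/dy is estimated by a centered difference. The helper names `slope` and `classify` below are hypothetical, and the tolerance is an arbitrary choice for this sketch.

```python
def slope(f, Y, eps=1e-6):
    """Centered-difference estimate of c = df/dy at y = Y."""
    return (f(Y + eps) - f(Y - eps)) / (2 * eps)

def classify(f, Y, tol=1e-8):
    """Apply the test: stable if c < 0, unstable if c > 0, undecided if c is (nearly) 0."""
    c = slope(f, Y)
    if c < -tol:
        return "stable"
    if c > tol:
        return "unstable"
    return "undecided"

def logistic(y):
    return 4 * y - y * y      # a = 4, b = 1: steady states 0 and 4

def critical(y):
    return -(y - 2) ** 2      # double steady state at Y = 2

print(classify(logistic, 0.0), classify(logistic, 4.0), classify(critical, 2.0))
```

The "undecided" answer at the double root is the borderline case that the slope test cannot settle.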
The Harvesting Equation


Suppose the logistic equation also includes a constant harvesting rate −h. This will
reduce the growth rate dy/dt. Let me start with the logistic equation $dy/dt = 4y - y^2$,
where the S-curve rises from Y = 0 to the other steady state Y = a/b = 4/1. If the new
harvesting term is −h = −3, the steady states change from 0 and 4 to 1 and 3:

$$\frac{dy}{dt} = 4y - y^2 - 3 \quad\text{has new steady states}\quad Y = 1 \text{ and } Y = 3. \tag{19}$$

I found 1 and 3 by factoring $4Y - Y^2 - 3$ into −(Y − 1)(Y − 3). Those populations Y = 1
and Y = 3 are the points where the equation is dy/dt = 0. Then y = Y stays steady.


[Figure 1.10: two graphs of f(y) against y. Logistic panel, $f(y) = ay - by^2$: the slope is < 0 at Y = a/b, so that Y is stable. Harvesting panel, $f(y) = 4y - y^2 - h$.]

Figure 1.10: Harvesting lowers the parabola $f(y) = ay - by^2 - h$. Steady Y's disappear.


This figure shows the stability or instability of the steady states. Y = 0 in the logistic 
graph and Y = 1 in the harvesting graph are unstable. At those points f(y) climbs from 
negative to positive. Above Y, the graph shows dy/dt = f(y) as positive. So y(t) will 
increase, and it moves away from Y . 

Y = a/b in the logistic graph and Y = 3 in the harvesting graph are stable. Beyond 
those points f(y) is negative. This is dy/dt. So y(t) decreases back toward Y. The graphs 
are a little tricky to read, because they don’t show y(t). They show the phase plane with 
y’ = f(y) against y : Velocity versus position, not position versus time ! 


Looking again at the figure, h = 4 gives critical harvesting: One double stationary
point Y = 2. That curve shows dy/dt = f(y) as negative everywhere except at Y = 2, so y(t) will
decrease. If y(0) is greater than 2, then y(t) must come back toward Y = 2. But this is
one-sided stability, because if y(0) is smaller than 2, then y(t) will decrease and go
far away from 2.

The lowest curve has h = 5 and no steady states. At all points dy/dt = f(y) is
negative. All solutions y(t) are decreasing. If we can find a formula for y(t), we can watch
this happen: y(t) → −∞. The logistic and harvesting equations are terrific nonlinear
examples, because we can actually find y(t).
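The three cases (two, one, or no steady states) come straight from the quadratic formula. A small sketch, with a hypothetical helper `steady_states` and the text's values a = 4, b = 1 as defaults:

```python
import math

def steady_states(h, a=4.0, b=1.0):
    """Real roots Y of a*Y - b*Y^2 - h = 0, i.e. b*Y^2 - a*Y + h = 0."""
    disc = a * a - 4 * b * h
    if disc < 0:
        return []                      # overharvesting: no steady states
    if disc == 0:
        return [a / (2 * b)]           # critical harvesting: one double root
    r = math.sqrt(disc)
    return [(a - r) / (2 * b), (a + r) / (2 * b)]

for h in (0.0, 3.0, 4.0, 5.0):
    print(h, steady_states(h))
```

With a = 4 and b = 1 the critical harvesting rate is h = a²/4b = 4, where the two roots merge at Y = 2.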
Solving the Harvesting Equation


We have three types of harvesting equations, with 2 or 1 or 0 steady states:

h < 4    $y' = 4y - y^2 - h$ will reduce to a logistic equation: underharvesting
h = 4    $y' = -(y - 2)^2$ has a double steady state: critical harvesting
h > 4    y' stays below zero and y(t) approaches −∞: overharvesting.

All these equations are autonomous, so they separate into dy/f(y) = dt. Integrate 1/f(y).

Small h = 3    Factor f(y) into −(y − 1)(y − 3)    Then Y = 1 and Y = 3

Let me shift those steady states down to V = 0 and V = 2, by shifting y(t) to
v(t) = y(t) − 1. The equation for v(t) is logistic, and its S-curve climbs from 0 to 2:

$$(1 + v)' = -(v)(v - 2) \quad\text{is}\quad v' = 2v - v^2 \tag{20}$$

When you add back the 1 to get y = 1 + v, its S-curve climbs from 1 to 3.


Critical h = 4    Factor $f(y) = 4y - y^2 - 4 = -(y - 2)^2$    Then Y = 2 and 2

The equation is $y' = -(y - 2)^2$. Shifting to v(t) = y(t) − 2 gives $dv/dt = -v^2$.
Page 1 of this book had the equation $dy/dt = y^2$ (with time going the other way).
The solution looks so innocent:

$$v(t) = \frac{v(0)}{1 + t\,v(0)}$$

goes gently to v = 0 as t → ∞ provided v(0) > 0
goes suddenly to v = −∞ when 1 + t v(0) = 0

This shows (one-sided) stability if y(0) > 2 and v(0) > 0.
When harvesting is more than critical, the population dies out from every y(0). 


Overharvesting h = 5    Write $y' = 4y - y^2 - 5 = -1 - (y - 2)^2$. Always y' < 0.
Now v = y − 2 simplifies the equation to $v' = -1 - v^2$. Integrate $dv/(1 + v^2) = -dt$ to
get $\tan^{-1} v = -t + C$. If v(0) = 0 then C = 0. Now go back to y = v + 2:

$$\frac{dv}{dt} = -1 - v^2 \text{ with } v(0) = 0 \text{ gives } v(t) = \tan(-t). \quad\text{Then}\quad y(t) = 2 - \tan t. \tag{21}$$

When the tangent reaches 2, the population y = 0 is all gone. If the solution continues
to t = π/2, then tan t is infinite. The model loses meaning and y(π/2) = −∞.
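Formula (21) can be compared with a direct numerical solution. The sketch below uses a classical fourth-order Runge-Kutta step, which is a standard method chosen here for illustration and not one introduced in this section. The step count is arbitrary.

```python
import math

def f(y):
    """Overharvesting right side: y' = 4y - y^2 - 5."""
    return 4 * y - y * y - 5

def rk4(y0, t_end, steps):
    """Integrate y' = f(y) from t = 0 to t_end with classical RK4."""
    h = t_end / steps
    y = y0
    for _ in range(steps):
        k1 = f(y)
        k2 = f(y + h * k1 / 2)
        k3 = f(y + h * k2 / 2)
        k4 = f(y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return y

y_num = rk4(2.0, 1.0, 1000)       # numerical y(1) starting from y(0) = 2
y_exact = 2 - math.tan(1.0)       # formula (21)
t_zero = math.atan(2.0)           # time when tan t = 2 and y = 0

print(y_num, y_exact, t_zero, rk4(2.0, t_zero, 1000))
```

The population reaches zero at t = arctan 2, well before the blowup at t = π/2.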


Overall, I hope you see how a simple stability test tells so much about y’ = f(y): 


1 Find all solutions to f(y) = 0    2 If df/dy < 0 at y = Y, that state is stable.
REVIEW OF THE KEY IDEAS

1. The logistic equation $dy/dt = ay - by^2$ has steady states at Y = 0 and Y = a/b.

2. The S-curve $y(t) = a/(d\,e^{-at} + b)$ approaches the carrying capacity y(∞) = a/b.

3. The equation for z = 1/y is linear! Or we can separate into $dy/\left(y - \frac{b}{a}y^2\right) = a\,dt$.

4. The stability test df/dy = a − 2by < 0 is passed at Y = a/b and failed at Y = 0.

5. This stability test applies to all equations y' = f(y) including $y' = ay - by^2 - h$.


Problem Set 1.7 


1 If y(0) = a/2b, the halfway point on the S-curve is at t = 0. Show that d = b and

$$y(t) = \frac{a}{d\,e^{-at} + b} = \frac{a}{b}\,\frac{1}{e^{-at} + 1}.$$

Sketch the curve from $y_{-\infty} = 0$ to $y_{\infty} = \frac{a}{b}$.


2 If the carrying capacity of the Earth is K = a/b = 14 billion people, what will be the 
population at the inflection point ? What is dy/dt at that point ? The actual population 
was 7.14 billion on January 1, 2014. 


3 Equation (18) must give the same formula for the solution y(t) as equation (16).
If the right side of (18) is called R, we can solve that equation for y:

$$y = R\left(1 - \frac{b}{a}y\right) \;\rightarrow\; \left(1 + R\frac{b}{a}\right) y = R \;\rightarrow\; y = \frac{R}{1 + R\frac{b}{a}}.$$

Simplify that answer by algebra to recover equation (16) for y(t).


4 Change the logistic equation to $y' = y + y^2$. Now the nonlinear term is positive,
and cooperation of y with y promotes growth. Use z = 1/y to find and solve a
linear equation for z, starting from z(0) = y(0) = 1. Show that y(T) = ∞ when
$e^{-T} = 1/2$. Cooperation looks bad: the population will explode at t = T.

5 The US population grew from 313,873,685 in 2012 to 316,128,839 in 2014. If it
were following a logistic S-curve, what equations would give you a, b, d in the formula
(4)? Is the logistic equation reasonable and how to account for immigration?

6 The Bernoulli equation $y' = ay - by^n$ has competition term $by^n$. Introduce
$z = y^{1-n}$ which matches the logistic case when n = 2. Follow equation (4) to
show that z' = (n − 1)(−az + b). Write z(t) as in (5)-(6). Then you have y(t).
Problems 7-13 develop better pictures of the logistic and harvesting equations.

7 $y' = y - y^2$ is solved by $y(t) = 1/(d\,e^{-t} + 1)$. This is an S-curve when y(0) = 1/2
and d = 1. But show that y(t) is very different if y(0) > 1 or if y(0) < 0.
If y(0) = 2 then d = 1/2 − 1 = −1/2. Show that y(t) → 1 from above.
If y(0) = −1 then d = 1/(−1) − 1 = −2. At what time T is y(T) = −∞?

8 (recommended) Show those 3 solutions to $y' = y - y^2$ in one graph! They start
from y(0) = 1/2 and 2 and −1. The S-curve climbs from 1/2 to 1. Above that,
y(t) descends from 2 to 1. Below the S-curve, y(t) drops from −1 to −∞.

Can you see 3 regions in the picture? Dropin curves above y = 1 and S-curves
sandwiched between 0 and 1 and dropoff curves below y = 0.

9 Graph $f(y) = y - y^2$ to see the unstable steady state Y = 0 and the stable Y = 1.
Then graph $f(y) = y - y^2 - 2/9$ with harvesting h = 2/9. What are the steady states
$Y_1$ and $Y_2$? The 3 regions in Problem 8 now have Z-curves above y = 2/3, S-curves
sandwiched between 1/3 and 2/3, dropoff curves below y = 1/3.

10 What equation produces an S-curve climbing to $y_{\infty} = K$ from $y_{-\infty} = L$?

11 $y' = y - y^2 - \frac14 = -\left(y - \frac12\right)^2$ shows critical harvesting with a double steady state
at y = Y = 1/2. The layer of S-curves shrinks to that single line. Sketch a dropin
curve that starts above y(0) = 1/2 and a dropoff curve that starts below y(0) = 1/2.

12 Solve the equation $y' = -\left(y - \frac12\right)^2$ by substituting v = y − 1/2 and solving $v' = -v^2$.

13 With overharvesting, every curve y(t) drops to −∞. There are no steady states.
Solve $Y - Y^2 - h = 0$ (quadratic formula) to find only complex roots if 4h > 1.

The solutions for h = 5/4 are y(t) = 1/2 − tan(t + C). Sketch that dropoff if
C = 0. Animal populations don't normally collapse like this from overharvesting.


14 With two partial fractions, this is my preferred way to find $A = \dfrac{1}{r - s}$, $B = \dfrac{1}{s - r}$:

$$\text{PF2}\qquad \frac{1}{(y - r)(y - s)} = \frac{1}{(y - r)(r - s)} + \frac{1}{(y - s)(s - r)}$$

Check that equation: The common denominator on the right is (y − r)(y − s)(r − s).
The numerator should cancel the r − s when you combine the two fractions.

1 1 B 
Separate ——— and —;—— into two fractions + ; 
Yee yoy at es 
Note When y approaches r, the left side of PF2 has a blowup factor 1/(y − r).
The other factor 1/(y − s) correctly approaches A = 1/(r − s). So the right side
of PF2 needs the same blowup at y = r. The first term A/(y − r) fits the bill.
15 The threshold equation is the logistic equation backward in time:

$$-\frac{dy}{dt} = ay - by^2 \quad\text{is the same as}\quad \frac{dy}{dt} = -ay + by^2.$$

Now Y = 0 is the stable steady state. Y = a/b is the unstable state (why?).
If y(0) is below the threshold a/b then y(t) → 0 and the species will die out.
Graph y(t) with y(0) < a/b (reverse S-curve). Then graph y(t) with y(0) > a/b.

16 (Cubic nonlinearity) The equation y' = y(1 − y)(2 − y) has three steady states:
Y = 0, 1, 2. By computing the derivative df/dy at y = 0, 1, 2, decide whether
each of these states is stable or unstable.
Draw the stability line for this equation, to show y(t) leaving the unstable Y's.
Sketch a graph that shows y(t) starting from y(0) = 1/2 and 3/2 and 5/2.

17 (a) Find the steady states of the Gompertz equation dy/dt = y(1 − ln y).
(b) Show that z = ln y satisfies the linear equation dz/dt = 1 − z.
(c) The solution $z(t) = 1 + e^{-t}(z(0) - 1)$ gives what formula for y(t) from y(0)?

18 Decide stability or instability for the steady states of
(a) $dy/dt = 2(1 - y)(1 - e^y)$    (b) $dy/dt = (1 - y^2)(4 - y^2)$

19 Stefan's Law of Radiation is $dy/dt = K(M^4 - y^4)$. It is unusual to see fourth powers.
Find all real steady states and their stability. Starting from y(0) = 1/2, sketch a graph
of y(t).

20 $dy/dt = ay - y^3$ has how many steady states Y for a < 0 and then a > 0?
Graph those values Y(a) to see a pitchfork bifurcation: new steady states suddenly
appear as a passes zero. The graph of Y(a) looks like a pitchfork.

21 (Recommended) The equation dy/dt = sin y has infinitely many steady states.
What are they and which ones are stable? Draw the stability line to show whether
y(t) increases or decreases when y(0) is between two of the steady states.

22 Change Problem 21 to $dy/dt = (\sin y)^2$. The steady states are the same, but now the
derivative of $f(y) = (\sin y)^2$ is zero at all those states (because sin y is zero). What
will the solution actually do if y(0) is between two steady states?

23 (Research project) Find actual data on the US population in the years 1950, 1980,
and 2010. What values of a, b, d in the solution formula (7) will fit these values? Is
the formula accurate at 2000, and what population does it predict for 2020 and 2100?
You could reset t = 0 to the year 1950 and rescale time so that t = 3 is 1980.
24 If dy/dt = f(y), what is the limit y(∞) starting from each point y(0)?

25 (a) Draw a function f(y) so that y(t) approaches y(∞) = 3 from every y(0).
(b) Draw f(y) so that y(∞) = 4 if y(0) > 0 and y(∞) = −2 if y(0) < 0.

26 Which exponents n in $dy/dt = y^n$ produce blowup y(T) = ∞ in a finite time?
You could separate the equation into $dy/y^n = dt$ and integrate from y(0) = 1.

27 Find the steady states of $dy/dt = y^2 - y^4$ and decide whether they are stable, unstable,
or one-sided stable. Draw a stability line to show the final value y(∞) from each initial
value y(0).

28 For an autonomous equation y' = f(y), why is it impossible for y(t) to be increasing
at one time $t_1$ and decreasing at another time $t_2$?


The website math.mit.edu/dela has more graph questions for autonomous y’ = f(y). 


Notes on feedback The S-curve represents a good response from an elevator. The transient 
response in the middle of the S is the fast movement between floors. The elevator slows 
down as it approaches steady state (the floor it is going to). There is a feedback loop to tell 
the elevator how far it is from its destination, and control its speed. 

An open-loop system has no feedback. A simple toaster will keep going and burn your
toast. The end time is entirely controlled by the input setting. A closed-loop system feeds
back the difference between the state y(t) and the desired steady state $y_{\infty}$. A toaster oven
can avoid burning by feeding back the temperature.

The logistic equation is nonlinear because of its feedback term $-by^2$. This is so common
in other examples of movement and growth. Our brain controls arm movement and brings
it to a stop. Your car has thousands of computer chips and controllers that measure position
and speed, to slow down and stop before disaster.

I admit that I don’t use cruise control because the car might keep cruising—I am not too 
sure it will stop. But it does have a feedback loop to keep the car below a set speed. 
1.8 Separable Equations and Exact Equations


This section presents two special types of first order nonlinear differential equations. 
They are a bridge between y’ = ay and the very general form y’ = f(t,y). These pages 
explain how to solve the two types in between, by ordinary integration. Separable 
equations are the simplest. For exact equations, see formulas (12) and (15). 


Separable    $\dfrac{dy}{dt} = \dfrac{g(t)}{f(y)}$            Exact    $\dfrac{dy}{dt} = \dfrac{g(y, t)}{f(y, t)}$


1. Separable Equations f(y)dy = g(t)dt
With f(y) on one side and g(t) on the other side, you see the meaning of separable.
The ordinary way to write this equation would be

$$\frac{dy}{dt} = \frac{g(t)}{f(y)} \quad\text{starting from } y(0) \text{ at time } t = 0. \tag{1}$$

When dy/dt has this separable form, we combine f(y) with dy and g(t) with dt. Those
functions f and g need to be integrated. The integrals F(y) and G(t) start at y = y(0)
and t = 0:

$$F(y) = \int_{u=y(0)}^{y} f(u)\,du \qquad G(t) = \int_{x=0}^{t} g(x)\,dx \tag{2}$$


The dummy variables u and x were chosen because y and t are needed in the upper limits
of integration. Every author faces this question, to select variables. To show that the
letters u and x don't matter, I could change them to Y and T.

After integrating f and g, we have implicitly solved the differential equation:

$$\text{Solution}\quad \frac{dy}{dt} = \frac{g(t)}{f(y)} \quad\text{integrates to}\quad F(y) = G(t). \tag{3}$$

To get an explicit solution y = ... we have to solve this equation F(y) = G(t) to find y.

Example 1 $\dfrac{dy}{dt} = \dfrac{t}{y}$. Then y dy = t dt. Integrate to find $\frac12\left(y(t)^2 - y(0)^2\right) = \frac12 t^2$.
Solve this implicit equation to find y(t) explicitly:

$$\text{Solution}\quad y(t) = \sqrt{y(0)^2 + t^2}. \quad\text{Then}\quad \frac{dy}{dt} = \frac{t}{\sqrt{y(0)^2 + t^2}} = \frac{t}{y}.$$
Example 2 dy/dt = 2ty has g(t) = 2t divided by f(y) = 1/y.
Solution Separate 1/y from 2t and integrate to get F = ln y − ln y(0) and G = $t^2$:

$$\frac{dy}{y} = 2t\,dt \quad\text{leads to}\quad \int_{y(0)}^{y} \frac{du}{u} = \ln y - \ln y(0) \quad\text{and}\quad \int_0^t 2x\,dx = t^2.$$

In this example, F(y) = G(t) produces ln y = ln y(0) + $t^2$. Take exponentials of both sides
to find the solution y:

$$y = e^{\ln y(0)}\,e^{t^2} = y(0)\,e^{t^2}. \tag{4}$$

I always check the derivative dy/dt and the starting value y(0):

$$\frac{d}{dt}\left(y(0)\,e^{t^2}\right) = 2t\left(y(0)\,e^{t^2}\right) = 2ty \qquad y(0)\,e^{t^2} = y(0) \text{ at } t = 0. \tag{5}$$
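The same derivative check can be done numerically for Examples 1 and 2. This is only a sketch: the starting value y(0) = 3 and the centered-difference helper `derivative` are choices made here for illustration.

```python
import math

def derivative(func, t, eps=1e-6):
    """Centered-difference estimate of d(func)/dt."""
    return (func(t + eps) - func(t - eps)) / (2 * eps)

y0 = 3.0  # illustrative starting value y(0)

def example1(t):
    """Solution of dy/dt = t/y: y = sqrt(y(0)^2 + t^2)."""
    return math.sqrt(y0**2 + t**2)

def example2(t):
    """Solution of dy/dt = 2ty: y = y(0) e^{t^2}."""
    return y0 * math.exp(t**2)

for t in (0.0, 0.5, 1.0):
    r1 = derivative(example1, t) - t / example1(t)       # residual of dy/dt = t/y
    r2 = derivative(example2, t) - 2 * t * example2(t)   # residual of dy/dt = 2ty
    print(t, r1, r2)
```

Both residuals are zero up to the accuracy of the finite difference, and both solutions return y(0) at t = 0.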


Example 3 Our favorite equation $\dfrac{dy}{dt} = ay + q$ is separable when a and q are constant.
Move $y + \frac{q}{a}$ to the left side below dy. Keep a dt on the right side. Then integrate
both sides, and you have solved this equation once more!

$$\frac{dy}{y + \frac{q}{a}} = a\,dt \quad\text{gives}\quad \ln\left(y + \frac{q}{a}\right) = at + C. \tag{6}$$

Take exponentials to find y, and set t = 0 to find C:

$$\text{Exponential growth}\quad y(t) + \frac{q}{a} = e^{at}e^{C} \quad\text{and}\quad y(0) + \frac{q}{a} = e^{C}. \tag{7}$$

Substitute for $e^C$ in the left equation, to get the answer we know:

$$y(t) + \frac{q}{a} = e^{at}\left(y(0) + \frac{q}{a}\right) \quad\text{and then}\quad y(t) = e^{at}y(0) + \frac{q}{a}\left(e^{at} - 1\right). \tag{8}$$


This answer was the key to Section 1.4. Here the formulas came faster (the first one 
in that box looks attractive). But I like the old way: Follow each input as it grows. 


Example 4 (Logistic equation)

$$\frac{dy}{dt} = ay - by^2 \qquad \int_{y(0)}^{y} \frac{du}{au - bu^2} = \int_{t(0)}^{t} dx \tag{9}$$

The right side is certainly G(t) = t − t(0). I am including t(0) to show how the system
allows any starting value for t as well as y. We don't know a perfect starting time for the
Earth's population, so we pick a year like t(0) = 2000 and work from there. The key point
is that two integrals F(y) and G(t) give the answer.
Section 1.7 computed those integrals and solved the logistic equation.
2. Exact Equations f(y, t)dy = g(y, t)dt


A separable equation has dy/dt = g(t)/f(y). We wrote this as f(y)dy = g(t)dt. We
integrated the two sides separately to get F(y) = G(t). This solved the equation.

Exact equations are not required to be separable. The functions f and g can depend
on both variables t and y. The equation does not split into a pure y-integration and a pure
t-integration. We now have f(y, t) dy = g(y, t) dt. But sometimes we can still integrate
the left side f(y, t) with respect to y, as if t were a constant (which it is not).

$$\text{Step 1}\quad \text{Integrate } f \text{ with respect to } y \qquad \int f(y, t)\,dy = F(y, t) + C(t). \tag{10}$$

Normally, any constant C can be added to an integral. The answer stays correct, because the
derivative of C is zero. Here, any function of t can be added to the integral,
because the y derivative of any C(t) is zero. Now F(y, t) + C(t) has more flexibility.

$$\text{Step 2 (if possible)}\quad \text{Choose } C(t) \text{ so that}\quad \frac{\partial}{\partial t}\bigl(F(y, t) + C(t)\bigr) = -g(y, t). \tag{11}$$

If that choice of C(t) is possible, our original equation involving g and f is solved:

$$\text{Step 3}\quad \frac{dy}{dt} = \frac{g(y, t)}{f(y, t)} \quad\text{is solved by}\quad F(y, t) + C(t) = \text{any constant}. \tag{12}$$


Before I show when and why this works, here is an example of success. 


Example 5 The equation $\dfrac{dy}{dt} = \dfrac{2yt - 1}{y^2 - t^2}$ has g = 2yt − 1 and $f = y^2 - t^2$.

Step 1 Integrate $f\,dy = (y^2 - t^2)\,dy$ to find $F(y, t) = \frac13 y^3 - yt^2$. Then $\dfrac{\partial F}{\partial t} = -2ty$.

Step 2 Solve equation (11) for C(t). For our particular f and g, this is possible:

$$-2ty + \frac{dC}{dt} = -(2yt - 1) \quad\text{gives}\quad \frac{dC}{dt} = 1 \text{ and } C(t) = t.$$

Step 3 The original $\dfrac{dy}{dt} = \dfrac{g}{f}$ is solved by F(y, t) + C(t) = constant:

$$\text{Solution from } F + C \text{ (constant is set by } y(0)\text{)}\qquad \frac13 y^3 - yt^2 + t = \frac13 y(0)^3.$$

To check this answer, take its time derivative implicitly (which means: just do it).

$$\text{Implicit differentiation}\quad y^2\frac{dy}{dt} - 2yt - t^2\frac{dy}{dt} + 1 = 0.$$

This is our equation $dy/dt = (2yt - 1)/(y^2 - t^2)$ as we hoped. Now to explain why.
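One more check of Example 5 is to integrate the equation numerically and watch F + C stay constant along the solution. The sketch below uses a classical Runge-Kutta step (a standard method, chosen here for illustration) with the arbitrary starting value y(0) = 2; the names `rhs`, `H` and `rk4` are hypothetical.

```python
def rhs(t, y):
    """Right side of Example 5: dy/dt = (2yt - 1) / (y^2 - t^2)."""
    return (2 * y * t - 1) / (y * y - t * t)

def H(y, t):
    """The conserved combination F(y,t) + C(t) = y^3/3 - y t^2 + t."""
    return y**3 / 3 - y * t * t + t

def rk4(t0, y0, t_end, steps):
    """Integrate dy/dt = rhs(t, y) from t0 to t_end with classical RK4."""
    h = (t_end - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, y + h * k1 / 2)
        k3 = rhs(t + h / 2, y + h * k2 / 2)
        k4 = rhs(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

y_start = 2.0
y_end = rk4(0.0, y_start, 1.0, 1000)
print(y_end, H(y_start, 0.0), H(y_end, 1.0))
```

The two values of H agree, so the numerical solution stays on the level curve F + C = y(0)³/3. The starting value was picked so that y² − t² stays away from zero on this time interval.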
The Exactness Condition


When is Step 2 possible? Sometimes there is C(t) to solve equation (11), but usually not.
To find the condition for exactness, take the y-derivative of both sides in Step 2:

$$\frac{\partial}{\partial y}\frac{\partial}{\partial t}\bigl(F(y, t) + C(t)\bigr) = -\frac{\partial}{\partial y}\,g(y, t). \tag{13}$$

The order of ∂/∂y and ∂/∂t can always be reversed. Certainly ∂C/∂y = 0 and ∂F/∂y = f.

$$\text{The left side of (13) is}\quad \frac{\partial}{\partial y}\frac{\partial}{\partial t}F(y, t) = \frac{\partial}{\partial t}\frac{\partial}{\partial y}F(y, t) \quad\text{which is}\quad \frac{\partial}{\partial t}f(y, t). \tag{14}$$

Comparing (14) with (13), Step 2 is only possible when our original differential equation
dy/dt = g/f is exact:

$$\text{Exactness condition on } f \text{ and } g \qquad \frac{\partial}{\partial t}f(y, t) = -\frac{\partial}{\partial y}\,g(y, t) \tag{15}$$


When the equation is exact, Step 2 will produce C(t). The final question is about Step 3.
Why is F(y, t) + C(t) = constant for the original differential equation dy/dt = g/f? To
see this, take the time derivative of F(y, t) + C(t) using the (implicit) chain rule:

$$\frac{\partial F}{\partial y}\frac{dy}{dt} + \frac{\partial F}{\partial t} + \frac{dC}{dt} = 0. \tag{16}$$

Step 1 produced ∂F/∂y = f. Step 2 produced ∂F/∂t + dC/dt = −g. We have success:

Equation (16) is $f\,\dfrac{dy}{dt} - g = 0$. This is our original problem $\dfrac{dy}{dt} = \dfrac{g}{f}$.

Example 5 was exact because g = 2yt − 1 and $f = y^2 - t^2$ agree on $\dfrac{\partial f}{\partial t} = -\dfrac{\partial g}{\partial y} = -2t$.
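The exactness condition (15) can be tested numerically at a few sample points with centered differences. In this sketch the helper `is_exact`, the sample points, and the second (non-exact) pair of functions are all illustrative assumptions; only the first pair comes from Example 5.

```python
def is_exact(f, g, points, eps=1e-5, tol=1e-6):
    """Check df/dt = -dg/dy at each sample point (y, t) by centered differences."""
    for y, t in points:
        df_dt = (f(y, t + eps) - f(y, t - eps)) / (2 * eps)
        dg_dy = (g(y + eps, t) - g(y - eps, t)) / (2 * eps)
        if abs(df_dt + dg_dy) > tol:
            return False
    return True

samples = [(1.0, 0.5), (2.0, 1.0), (-1.0, 2.0)]

def f5(y, t):
    return y * y - t * t      # Example 5: f = y^2 - t^2

def g5(y, t):
    return 2 * y * t - 1      # Example 5: g = 2yt - 1

def f_bad(y, t):
    return y * y + t * t      # sign changed on purpose: no longer exact with g5

print(is_exact(f5, g5, samples), is_exact(f_bad, g5, samples))
```

A pass at a handful of points is evidence, not a proof. The condition has to hold for all y and t.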


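Example 6 Steps 1, 2, 3 must be possible because this non-separable equation is exact:

$$\frac{dy}{dt} = \frac{t - y}{t + y} \quad\text{has}\quad \frac{\partial f}{\partial t} = -\frac{\partial g}{\partial y} = 1.$$

Step 1 Integrate $\int f\,dy = \int (t + y)\,dy$ to find $F = ty + \frac12 y^2$.

Step 2 Write out $\dfrac{\partial}{\partial t}(F + C) = -g = y - t$ to find $C(t) = -\frac12 t^2$.

Step 3 The example is solved by $F + C = ty + \frac12 y^2 - \frac12 t^2 = \text{constant} = \frac12 y(0)^2$.

To check that solution, find the total time derivative of F + C by the chain rule:

$$t\frac{dy}{dt} + y + y\frac{dy}{dt} - t = 0. \quad\text{Then}\quad \frac{dy}{dt} = \frac{t - y}{t + y} \text{ as desired.}$$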
Final Note: Separable is Exact

Notice that a separable equation dy/dt = g(t)/f(y) is always exact:

$$\text{(15) is satisfied}\qquad \frac{\partial}{\partial t}f(y) = -\frac{\partial}{\partial y}\,g(t) \quad\text{becomes}\quad 0 = 0.$$

No problem with integrating ∫ f(y)dy and ∫ g(t)dt to find F(y) and G(t) = −C(t).

REVIEW OF THE KEY IDEAS

1. A separable equation $\dfrac{dy}{dt} = \dfrac{g(t)}{f(y)}$ is solved by $\int f(y)\,dy = \int g(t)\,dt$ + any constant.

2. That solution gives y implicitly. Solve to find y explicitly as a function of t.

3. An exact equation $\dfrac{dy}{dt} = \dfrac{g(y, t)}{f(y, t)}$ has $\dfrac{\partial f}{\partial t} = -\dfrac{\partial g}{\partial y}$. Then F(y, t) + C(t) = constant.

4. The solution has $F(y, t) = \int f(y, t)\,dy$ for each t, and $C(t) = -\int \left(\dfrac{\partial F}{\partial t} + g\right) dt$.

5. The exactness condition in 3 removes y from that integral for C(t) in 4.


Problem Set 1.8 


1 Finally we can solve the example $dy/dt = y^2$ in Section 1.1 of this book.
Start from y(0) = 1. Then $\displaystyle\int_1^y \frac{dy}{y^2} = \int_0^t dt$. Notice the limits on y and t. Find y(t).

2 Start the same equation $dy/dt = y^2$ from any value y(0). At what time t does the
solution blow up? For which starting values y(0) does it never blow up?

3 Solve dy/dt = a(t)y as a separable equation starting from y(0) = 1, by choosing
f(y) = 1/y. This equation gave the growth factor G(0, t) in Section 1.6.


4 Solve these separable equations starting from y(0) = 0: 
dy dy 
— t b =_ t™ mr 
(a) a TY (b) at y 
dy _ ree be . : a 
5 Solve — = a(t)y* = as a separable equation starting from y(0) = 1. 


dt 


6 The equation dy/dt = y + t is not separable or exact. But it is linear and y = ____ .

7 The equation dy/dt = y/t has the solution y = At for every constant A. Find this solution
by separating f = 1/y from g = 1/t. Then integrate dy/y = dt/t. Where does the
constant A come from?


8 For which number A is 


d t — 
i) pupusa © an exact equation ? For this A, solve the 
dt  At+by 


equation by finding a suitable function F'(y, t) + C(t). 
9 Find a function y(t) different from y = t that has $dy/dt = y^2/t^2$.

10 These equations are separable after factoring the right hand sides:

$$\text{Solve}\quad \frac{dy}{dt} = e^{y+t} \quad\text{and}\quad \frac{dy}{dt} = yt + y + t + 1.$$


d d 
11 = These equations are linear and separable: Solve = = (y+ 4) cost and = = Wer 


12 Solve these three separable equations starting from y(0) = 1: 


die 


dy _ 
dt Fe 


Lee 3 dy 
(a) B= —4ty —) ty? ©) (1+ tH) = 4y 


Test the exactness condition ∂g/∂y = −∂f/∂t and solve Problems 13-14.


d —3t? — 2y? d 1+ ye 
13 (a) 2  () == = 
dt A4ty + 6y dt 2y + te’¥ 
dy  4t— d 3t? + 2y? 
14 (a) a aaa) (b) LC oA a 
dt t—6y dt Aty + 6y? 
15 Show that $\dfrac{dy}{dt} = -\dfrac{y^2}{2ty}$ is exact but the same equation $\dfrac{dy}{dt} = -\dfrac{y}{2t}$ is not exact.
Solve both equations. (This problem suggests that many equations become exact
when multiplied by an integrating factor.)


16 Exactness is really the condition to solve two equations with the same function H(t, y):

$$\frac{\partial H}{\partial y} = f(t, y) \quad\text{and}\quad \frac{\partial H}{\partial t} = -g(t, y) \quad\text{can be solved if}\quad \frac{\partial f}{\partial t} = -\frac{\partial g}{\partial y}.$$

Take the t derivative of ∂H/∂y and the y derivative of ∂H/∂t to show that exactness
is necessary. It is also sufficient to guarantee that a solution H will exist.


17 The linear equation $\dfrac{dy}{dt} = aty + q$ is not exact or separable. Multiply by the integrating
factor $e^{-\int at\,dt}$ and solve the equation starting from y(0).
Second order equations F(t, y, y', y'') = 0 involve the second derivative y''.
This reduces to a first order equation for y' (not y) in two important cases:

I. When y is missing in F, set y' = v and y'' = v'. Then F(t, v, v') = 0.

II. When t is missing in F, set $y'' = \dfrac{dv}{dt} = \dfrac{dv}{dy}\dfrac{dy}{dt} = v\dfrac{dv}{dy}$. Then $F\left(y, v, v\dfrac{dv}{dy}\right) = 0$.

See the website for reduction of order when one solution y(t) is known.

18 (y is missing) Solve these differential equations for v = y' with v(0) = 1. Then
solve for y with y(0) = 0.
(a) y'' + y' = 0    (b) 2ty'' − y' = 0.

19 Both y and t are missing in $y'' = (y')^2$. Set v = y' and go two ways:
I. (y missing) Solve $\dfrac{dv}{dt} = v^2$ for v(t) and then $\dfrac{dy}{dt} = v(t)$
with y(0) = 0, y'(0) = 1.
II. (t missing) Solve $v\dfrac{dv}{dy} = v^2$ for v(y) and then $\dfrac{dy}{dt} = v(y)$
with y(0) = 0, y'(0) = 1.

20 An autonomous equation y' = f(y) has no terms that contain t (t is missing).
Explain why every autonomous equation is separable. A non-autonomous equation
could be separable or not. For a linear equation we usually say LTI (linear
time-invariant) when it is autonomous: coefficients are constant, not varying with t.

21 my'' + ky = 0 is a highly important LTI equation. Two solutions are cos ωt and
sin ωt when $\omega^2 = k/m$. Solve differently by reducing to a first order equation for
y' = dy/dt = v with y'' = v dv/dy as above:

$$mv\frac{dv}{dy} + ky = 0 \quad\text{integrates to}\quad \frac12 mv^2 + \frac12 ky^2 = \text{constant } E.$$

For a mass on a spring, kinetic energy $\frac12 mv^2$ plus potential energy $\frac12 ky^2$ is a
constant energy E. What is E when y = cos ωt? What integral solves the separable
$m(y')^2 = 2E - ky^2$? I would not solve the linear oscillation equation this way.

22 my'' + k sin y = 0 is the nonlinear oscillation equation: not so simple. Reduce to a
first order equation as in Problem 21:

$$mv\frac{dv}{dy} + k\sin y = 0 \quad\text{integrates to}\quad \frac12 mv^2 - k\cos y = \text{constant } E.$$

With v = dy/dt what impossible integral is needed for this first order separable
equation? Actually that integral gives the period of a nonlinear pendulum. This
integral is extremely important and well studied even if impossible.
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= CHAPTER 1 NOTES #8 


The great function of calculus is $e^t$. How best to define this exponential function?
Section 1.3 constructed $y = e^t$ from its infinite series $1 + t + \frac{1}{2}t^2 + \frac{1}{6}t^3 + \cdots$. Euler would
approve! Taking the derivative of each term brings back $e^t$. This property $dy/dt = y$ is the
most important tool we have: it is the foundation of our subject.

I like this approach to $e^t$ for at least two reasons:


1. It is based on the derivatives of $t$ and $t^2$ and $t^n$: well known.


2. The Chapter 3 Notes solve nonlinear equations in exactly the same way. 


The limiting step required here is to add up an infinite series. We don't expect a simple
answer like $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots = 2$. But the numbers $1/n!$ in $e^t$ are much smaller than
these numbers $1/2^n$.


This is really the key point, to see that the terms $t^n/n!$ approach zero quickly.
The infinite series $1 + t + t^2/2 + \cdots + t^n/n! + \cdots$ converges for every $t$.


Proof. Each term $t^n/n!$ multiplies the previous term $t^{n-1}/(n-1)!$ by $t/n$. At some point
$n = N$, that number $t/N$ goes below $\frac{1}{2}$. From this point on, we know that

$\frac{t^N}{N!} + \frac{t^{N+1}}{(N+1)!} + \frac{t^{N+2}}{(N+2)!} + \cdots$ is less than $\frac{t^N}{N!}\left(1 + \frac{1}{2} + \frac{1}{4} + \cdots\right)$.

The right side is $t^N/N!$ times 2. The left side is smaller. The first $N$ terms that come before
$t^N/N!$ have no effect on convergence of the series (they just enter the final sum). So the
series for $e^t$ always converges.

If $t$ is negative, use its absolute value $|t|$ and the proof still succeeds. The series for
the derivative of $e^t$ is the same as the series for $e^t$. So we know: This series is absolutely
convergent. We can safely say that $y' = y$.
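
A short numerical experiment can make the ratio argument concrete. This is only a sketch added for illustration: the helper name `exp_series` and the choice of 60 terms are assumptions, not part of the notes. Each new term is the previous term times $t/n$, exactly as in the proof.

```python
import math

def exp_series(t, terms=60):
    """Partial sum of 1 + t + t^2/2! + ... using `terms` terms in total."""
    total, term = 1.0, 1.0          # the n = 0 term is 1
    for n in range(1, terms):
        term *= t / n               # t^n/n! = (t^(n-1)/(n-1)!) * (t/n)
        total += term
    return total

# Compare the partial sums with the library exponential.
for t in (1.0, -3.0, 10.0):
    print(t, exp_series(t), math.exp(t))
```

For $t = 10$ the terms grow until $n$ passes 10, and after $n = 20$ each term is less than half of the one before it. That is the point $N$ in the proof.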


Four approaches to $e^t$   Looking back at my own teaching and writing, I really missed the
importance of this big step in calculus. Just another function? Not at all. Textbooks offer
four main ways to construct $y = e^t$:

1. Add all the terms $t^n/n!$. The derivative of each term is the previous $t^{n-1}/(n-1)!$
2. Take the nth power of $(1 + t/n)$ as in compound interest. Let $n$ approach infinity.
3. The slope of $b^t$ is $C$ times $b^t$. Choose $e$ as the value of $b$ that makes $C = 1$.
4. Integrate $1/y$ to construct $t = \ln y$. Invert this function to find $y = e^t$.

I believe that 3 and 4 are too tricky. Explicit constructions are the winners. You want to
say, "Here is the function." In method 2 you are working with $(1 + t/n)^n$: not too bad.
In 1 you see step by step and term by term that $dy/dt = y$.
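
Method 2 is also easy to try on a computer. The sketch below is an illustration only (the function name `compound` is an assumption): it evaluates $(1 + t/n)^n$ at $t = 1$ for increasing $n$ and prints the gap to $e$.

```python
import math

def compound(t, n):
    """Method 2: the nth power of (1 + t/n), as in compound interest."""
    return (1.0 + t / n) ** n

for n in (10, 1000, 100000):
    approx = compound(1.0, n)
    print(n, approx, math.e - approx)   # the gap shrinks roughly like 1/n
```

The approach to $e$ is slow compared with the series in method 1, where the terms $1/n!$ shrink much faster than $1/n$.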


Chapter 2 


Second Order Equations 


2.1 Second Derivatives in Science and Engineering 


Second order equations involve the second derivative $d^2y/dt^2$. Often this is shortened to $y''$,
and then the first derivative is $y'$. In physical problems, $y'$ can represent velocity $v$ and the
second derivative $y'' = a$ is acceleration: the rate $dy'/dt$ that velocity is changing.

The most important equation in dynamics is Newton’s Second Law F = ma. 
Compare a second order equation to a first order equation, and allow them to be nonlinear: 


First order   $y' = f(t, y)$      Second order   $y'' = F(t, y, y')$      (1)


The second order equation needs two initial conditions, normally y(0) and y’(0)— 
the initial velocity as well as the initial position. Then the equation tells us y”(0) and 
the movement begins. 

When you press the gas pedal, that produces acceleration. The brake pedal also brings 
acceleration but it is negative (the velocity decreases). The steering wheel produces 
acceleration too! Steering changes the direction of velocity, not the speed. 


Right now we stay with straight line motion and one-dimensional problems : 


$\frac{d^2y}{dt^2} > 0$ (speeding up)      $\frac{d^2y}{dt^2} < 0$ (slowing down).


The graph of $y(t)$ bends upwards for $y'' > 0$ (the right word is convex). Then the
velocity $y'$ (slope of the graph) is increasing. The graph bends downwards for $y'' < 0$
(concave). Figure 2.1 shows the graph of $y = \sin t$, when the acceleration is $a =
d^2y/dt^2 = -\sin t$. The important equation $y'' = -y$ leads to $\sin t$ and $\cos t$.


Notice how the velocity dy/dt (slope of the graph) changes sign in between zeros of y. 
[Figure 2.1: graph of $y = \sin t$ with labels $y' = \cos t$ and $y'' = -\sin t$, and a marked stretch where the curve is going down and bending up.]


Figure 2.1: y’’ > 0 means that velocity y’ (or slope) increases. The curve bends upward. 


The best examples of $F = ma$ come when the force $F$ is $-ky$, a constant $k$ times
the "position" or "displacement" $y(t)$. This produces the oscillation equation.


Fundamental equation of mechanics   $m\frac{d^2y}{dt^2} + ky = 0$   (2)


Think of a mass hanging at the bottom of a spring (Figure 2.2). The top of the spring 
is fixed, and the spring will stretch. Now stretch it a little more (move the mass downward 
by y(0)) and let go. The spring pulls back on the mass. Hooke’s Law says that the force is 
F = —ky, proportional to the stretching distance y. Hooke’s constant is k. 

The mass will oscillate up and down. The oscillation goes on forever, because equation
(2) does not include any friction (damping term $b\,dy/dt$). The oscillation is a perfect cosine,
with $y = \cos\omega t$ and $\omega = \sqrt{k/m}$, because the second derivative has to produce $k/m$ to
match $y'' = -(k/m)y$.

Oscillation at frequency $\omega = \sqrt{\frac{k}{m}}$      $y = y(0)\cos\left(\sqrt{\frac{k}{m}}\,t\right)$   (3)

At time $t = 0$, this shows the extra stretching $y(0)$. The derivative of $\cos\omega t$ has a factor
$\omega = \sqrt{k/m}$. The second derivative $y''$ has the required $\omega^2 = k/m$, so $my'' = -ky$.

The movement of one spring and one mass is especially simple. There is only one
frequency $\omega$. When we connect $N$ masses by a line of springs there will be $N$ frequencies, and then
Chapter 6 has to study the eigenvalues of $N$ by $N$ matrices.


[Figure 2.2 labels: spring pushes down; $y'' < 0$, spring pulls up.]

Figure 2.2: Larger $k$ = stiffer spring = faster $\omega$. Larger $m$ = heavier mass = slower $\omega$.
Initial Velocity $y'(0)$


Second order equations have two initial conditions. The motion starts in an initial position
$y(0)$, and its initial velocity is $y'(0)$. We need both $y(0)$ and $y'(0)$ to determine the two
constants $c_1$ and $c_2$ in the complete solution to $my'' + ky = 0$:

"Simple harmonic motion"   $y = c_1\cos\left(\sqrt{\frac{k}{m}}\,t\right) + c_2\sin\left(\sqrt{\frac{k}{m}}\,t\right)$   (4)

Up to now the motion has started from rest ($y'(0) = 0$, no initial velocity). Then $c_1$ is
$y(0)$ and $c_2$ is zero: only cosines. As soon as we allow an initial velocity, the sine solution
$y = c_2\sin\omega t$ must be included. But its coefficient $c_2$ is not just $y'(0)$.


At $t = 0$, $\frac{dy}{dt} = c_2\,\omega\cos\omega t$ matches $y'(0)$ when $c_2 = \frac{y'(0)}{\omega}$.   (5)


The original solution $y = y(0)\cos\omega t$ matched $y(0)$, with zero velocity at $t = 0$. The
new solution $y = (y'(0)/\omega)\sin\omega t$ has the right initial velocity and it starts from zero. When
we combine those two solutions, $y(t)$ matches both conditions $y(0)$ and $y'(0)$:

Unforced oscillation   $y(t) = y(0)\cos\omega t + \frac{y'(0)}{\omega}\sin\omega t$ with $\omega = \sqrt{\frac{k}{m}}$.   (6)

With a trigonometric identity, I can combine those two terms (cosine and sine) into one.


Cosine with Phase Shift 


We want to rewrite the solution (6) as $y(t) = R\cos(\omega t - \alpha)$. The amplitude of $y(t)$
will be the positive number $R$. The phase shift or lag in this solution will be the angle $\alpha$.
By using the right identity for the cosine of $\omega t - \alpha$, we match both $\cos\omega t$ and $\sin\omega t$:

$R\cos(\omega t - \alpha) = R\cos\omega t\cos\alpha + R\sin\omega t\sin\alpha$.   (7)

This combination of $\cos\omega t$ and $\sin\omega t$ agrees with the solution (6) if

$R\cos\alpha = y(0)$ and $R\sin\alpha = \frac{y'(0)}{\omega}$.   (8)

Squaring those equations and adding will produce $R^2$:

Amplitude $R$   $R^2 = R^2(\cos^2\alpha + \sin^2\alpha) = (y(0))^2 + \left(\frac{y'(0)}{\omega}\right)^2$.   (9)

The ratio of the equations (8) will produce the tangent of $\alpha$:

Phase lag $\alpha$   $\tan\alpha = \frac{R\sin\alpha}{R\cos\alpha} = \frac{y'(0)}{\omega\,y(0)}$.   (10)

Problem 14 will discuss the angle $\alpha$ we should choose, since different angles can have the
same tangent. The tangent is the same if $\alpha$ is increased by $\pi$ or any multiple of $\pi$.

The pure cosine solution that started from $y'(0) = 0$ has no phase shift: $\alpha = 0$.
Then the new form $y(t) = R\cos(\omega t - \alpha)$ is the same as the old form $y(0)\cos\omega t$.
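
Equations (8) to (10) translate directly into a few lines of code. The sketch below is an illustration under stated assumptions: the function name `amplitude_phase` is hypothetical, and it uses the two-argument arctangent `math.atan2` so that the signs of $R\cos\alpha$ and $R\sin\alpha$ select the quadrant. That is a convenience of the code, not something the text prescribes.

```python
import math

def amplitude_phase(y0, v0, omega):
    """Return (R, alpha) with R*cos(alpha) = y0 and R*sin(alpha) = v0/omega."""
    c, s = y0, v0 / omega
    R = math.hypot(c, s)        # equation (9): R^2 = c^2 + s^2
    alpha = math.atan2(s, c)    # uses both signs, so the quadrant is not ambiguous
    return R, alpha

# Both right hand sides negative: the angle lands in the third quadrant.
R, alpha = amplitude_phase(y0=-1.0, v0=-2.0, omega=2.0)
print(R, alpha)
```

Note that `math.atan2` reports angles between $-\pi$ and $\pi$, so an angle in the third quadrant is printed as a negative number. Adding $2\pi$ gives the same point on the circle.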
Frequency $\omega$ or $f$


If the time $t$ is measured in seconds, the frequency $\omega$ is in radians per second.
Then $\omega t$ is in radians: it is an angle and $\cos\omega t$ is its cosine. But not everyone thinks
naturally about radians. Complete cycles are easier to visualize. So frequency is also
measured in cycles per second. A typical frequency in your home is $f = 60$ cycles per second.
One cycle per second is usually shortened to $f = 1$ Hertz. A complete cycle is $2\pi$ radians,
so $f = 60$ Hertz is the same frequency as $\omega = 120\pi$ radians per second.

The period is the time $T$ for one complete cycle. Thus $T = 1/f$. This is the only page
where $f$ is a frequency. On all other pages $f(t)$ is the driving function.


Frequency $\omega = 2\pi f$      Period $T = \frac{1}{f} = \frac{2\pi}{\omega}$
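
The two ways of measuring frequency differ only by the factor $2\pi$. A minimal sketch of the conversion follows; the function names are illustrative assumptions.

```python
import math

def angular_from_hertz(f):
    """omega in radians per second from f in cycles per second."""
    return 2.0 * math.pi * f

def period(f):
    """Time T for one complete cycle."""
    return 1.0 / f

print(angular_from_hertz(60.0))   # 60 Hertz is 120*pi radians per second
print(period(60.0))               # the same T as 2*pi/omega
```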


[Figure 2.3 labels: $y = A\cos\omega t = A\cos\left(\sqrt{k/m}\,t\right)$, starting at $t = 0$; horizontal axis: time.]

Figure 2.3: Simple harmonic motion $y = A\cos\omega t$: amplitude $A$ and frequency $\omega$.


Harmonic Motion and Circular Motion 


Harmonic motion is up and down (or side to side). When a point is in circular motion, 
its projections on the x and y axes are in harmonic motion. Those motions are closely 
related, which is why a piston going up and down can produce circular motion of a flywheel. 
The harmonic motion “speeds up in the middle and slows down at the ends” while the 
point moves with constant speed around the circle. 


[Figure 2.4 label: $y = \sin\omega t$.]


Figure 2.4: Steady motion around a circle produces cosine and sine motion along the axes. 
Response Functions


I want to introduce some important words. The response is the output $y(t)$. Up to now
the only inputs were the initial values $y(0)$ and $y'(0)$. In this case $y(t)$ would be the initial
value response (but I have never seen those words). When we only see a few cycles of the
motion, initial values make a big difference. In the long run, what counts is the response to a
forcing function like $f = \cos\omega t$.

Now $\omega$ is the driving frequency on the right hand side, where the natural frequency
$\omega_n = \sqrt{k/m}$ is decided by the left hand side: $\omega$ comes from $y_p$, $\omega_n$ comes from $y_n$.


When the motion is driven by $\cos\omega t$, a particular solution is $y_p = Y\cos\omega t$:

Forced motion $y_p(t)$ at frequency $\omega$      $my'' + ky = \cos\omega t$      $y_p(t) = \frac{1}{k - m\omega^2}\cos\omega t$.   (11)

To find $y_p(t)$, I put $Y\cos\omega t$ into $my'' + ky$ and the result was $(k - m\omega^2)Y\cos\omega t$.
This matches the driving function $\cos\omega t$ when $Y = 1/(k - m\omega^2)$.

The initial conditions are nowhere in equation (11). Those conditions contribute the null
solution $y_n$, which oscillates at the natural frequency $\omega_n = \sqrt{k/m}$. Then $k = m\omega_n^2$.

If I replace $k$ by $m\omega_n^2$ in the response $y_p(t)$, I see $\omega_n^2 - \omega^2$ in the denominator:

Response to $\cos\omega t$      $y_p(t) = \frac{1}{m(\omega_n^2 - \omega^2)}\cos\omega t$.   (12)


Our equation $my'' + ky = \cos\omega t$ has no damping term. That will come in Section 2.3.
It will produce a phase shift $\alpha$. Damping will also reduce the amplitude $|Y(\omega)|$. The
amplitude is all we are seeing here in $Y(\omega)\cos\omega t$:

Frequency response      $Y(\omega) = \frac{1}{k - m\omega^2} = \frac{1}{m(\omega_n^2 - \omega^2)}$.   (13)

The mass and spring, or the inductance and capacitance, decide the natural frequency $\omega_n$.
The response to a driving term $\cos\omega t$ (or $e^{i\omega t}$) is multiplication by the frequency response
$Y(\omega)$. The formula changes when $\omega = \omega_n$: we will study resonance!
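
The frequency response can be checked numerically. The sketch below is an illustration, not the book's method: the function names are assumptions, and the second derivative is approximated by a central difference with step `h`. It substitutes $y_p = Y\cos\omega t$ into $my'' + ky$ and reports how far the result is from $\cos\omega t$.

```python
import math

def frequency_response(m, k, omega):
    """Y(omega) = 1/(k - m*omega^2), valid away from resonance."""
    return 1.0 / (k - m * omega**2)

def residual(m, k, omega, t, h=1e-4):
    """m*y'' + k*y - cos(omega*t) for y = Y*cos(omega*t), y'' by central differences."""
    Y = frequency_response(m, k, omega)
    y = lambda u: Y * math.cos(omega * u)
    ypp = (y(t + h) - 2.0 * y(t) + y(t - h)) / h**2
    return m * ypp + k * y(t) - math.cos(omega * t)

print(frequency_response(2.0, 8.0, 1.0))   # here k - m*omega^2 = 6
print(residual(2.0, 8.0, 1.0, t=0.7))      # close to zero
```

With $m = 2$ and $k = 8$ the natural frequency is $\omega_n = 2$, so the driving frequency $\omega = 1$ is safely away from resonance. At $\omega = 2$ the function would divide by zero, which is the breakdown described in the text.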


With damping in Section 2.3, the frequency response $Y(\omega)$ will be a complex
number. We can't escape complex arithmetic and we don't want to. The magnitude $|Y(\omega)|$
will give the magnitude response (or amplitude response). The angle $\theta$ in the complex
plane will decide the phase response (then $\alpha = -\theta$ because we measure the phase lag).

The response is $Y(\omega)e^{i\omega t}$ to $f(t) = e^{i\omega t}$ and the response is $g(t)$ to $f(t) = \delta(t)$.
These show the frequency response $Y$ from equation (13) and the impulse response $g$ from
equation (15). $Ye^{i\omega t}$ and $g(t)$ are the two key solutions to $my'' + ky = f(t)$.
Impulse Response = Fundamental Solution


The most important solution to a linear differential equation will be called $g(t)$. In
mathematics $g$ is the fundamental solution. In engineering $g$ is the impulse response. It is
a particular solution when the right side $f(t) = \delta(t)$ is an impulse (a delta function).

The same $g(t)$ solves $mg'' + kg = 0$ when the initial velocity is $g'(0) = 1/m$.

Fundamental solution      $mg'' + kg = \delta(t)$ with zero initial conditions   (14)

Null solution also      $g(t) = \frac{\sin\omega_n t}{m\omega_n}$ has $g(0) = 0$ and $g'(0) = \frac{1}{m}$.   (15)

To find that null solution, I just put its initial values 0 and $1/m$ into equation (6).
The cosine term disappeared because $g(0) = 0$.

I will show that those two problems give the same answer. Then this whole chapter will
show why $g(t)$ is so important. For first order equations $y' = ay + q$ in Chapter 1,
the fundamental solution (impulse response, growth factor) was $g(t) = e^{at}$. The first two
names were not used, but you saw how $e^{at}$ dominated that whole chapter.


I will first explain the response g(t) in physical language. We strike the mass and it starts 
to move. All our force is acting at one instant of time: an impulse. A finite force within one 
moment is impossible for an ordinary function, only possible for a delta function. Remember
that the integral of $\delta(t)$ jumps to 1 when we pass the point $t = 0$.

If we integrate $mg'' = \delta(t)$, nothing happens before $t = 0$. In that instant, the integral
jumps to 1. The integral of the left side $mg''$ is $mg'$. Then $mg' = 1$ instantly at $t = 0$.
This gives $g'(0) = 1/m$. You see that computing with an impulse $\delta(t)$ needs some faith.


The point of $g(t)$ is that it solves the equation for any forcing function $f(t)$:

$my'' + ky = f(t)$ has the particular solution $y(t) = \int_0^t g(t - s)\,f(s)\,ds$.   (16)

That was the key formula of Chapter 1, when $g(t - s)$ was $e^{a(t-s)}$ and the equation was first
order. Section 2.3 will find $g(t)$ when the differential equation includes damping.
The coefficients in the equation will stay constant, to allow a neat formula for $g(t)$.
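
Formula (16) can be tried numerically before it is proved. The sketch below is only an illustration: the names `g` and `particular_solution`, the trapezoidal rule, and the sample values of $m$, $k$, $t$ are all assumptions. It takes the constant force $f = 1$ and compares the integral with the step response $\frac{1}{k} - \frac{1}{k}\cos\omega_n t$ that appears in Problem 22.

```python
import math

def g(t, m, k):
    """Impulse response sin(omega_n t)/(m omega_n) from equation (15)."""
    wn = math.sqrt(k / m)
    return math.sin(wn * t) / (m * wn)

def particular_solution(f, t, m, k, steps=2000):
    """Trapezoidal approximation to the integral of g(t - s) f(s) from s = 0 to s = t."""
    h = t / steps
    total = 0.5 * (g(t, m, k) * f(0.0) + g(0.0, m, k) * f(t))   # endpoint terms
    for j in range(1, steps):
        s = j * h
        total += g(t - s, m, k) * f(s)
    return h * total

m, k, t = 1.0, 4.0, 1.3
y = particular_solution(lambda s: 1.0, t, m, k)
print(y, (1.0 - math.cos(math.sqrt(k / m) * t)) / k)   # the two numbers nearly agree
```

Any other forcing function can be passed in place of the constant 1. Only the closed-form comparison on the last line is specific to $f = 1$.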


You may feel uncertain about working with delta functions—a means to an end. 
We will verify this final solution y(t) in three different ways: 


1 Substitute y(t) from (16) directly into the differential equation (Problem 21) 
2 Solve for y(t) by variation of parameters (Section 2.6) 


3 Solve again by using the Laplace transform Y(s) (Section 2.7). 
REVIEW OF THE KEY IDEAS


1. $my'' + ky = 0$: A mass on a spring oscillates at the natural frequency $\omega_n = \sqrt{k/m}$.
2. $my'' + ky = \cos\omega t$: This driving force produces $y_p = (\cos\omega t)/m(\omega_n^2 - \omega^2)$.
3. There is resonance when $\omega_n = \omega$. The solution $y_p = t\sin\omega t$ includes a new factor $t$.
4. $mg'' + kg = \delta(t)$ gives $g(t) = (\sin\omega_n t)/m\omega_n$ = null solution with $g'(0) = 1/m$.
5. Fundamental solution $g$: Every driving function $f$ gives $y(t) = \int_0^t g(t - s)\,f(s)\,ds$.
6. Frequency: $\omega$ radians per second or $f$ cycles per second ($f$ Hertz). Period $T = 1/f$.


Problem Set 2.1 


1 Find a cosine and a sine that solve $d^2y/dt^2 = -9y$. This is a second order equation
so we expect two constants $C$ and $D$ (from integrating twice):

Simple harmonic motion   $y(t) = C\cos\omega t + D\sin\omega t$. What is $\omega$?

If the system starts from rest (this means $dy/dt = 0$ at $t = 0$), which constant $C$ or $D$
will be zero?

2 In Problem 1, which $C$ and $D$ will give the starting values $y(0) = 0$ and $y'(0) = 1$?


3 Draw Figure 2.3 to show simple harmonic motion $y = A\cos(\omega t - \alpha)$ with phases
$\alpha = \pi/3$ and $\alpha = -\pi/2$.


4 Suppose the circle in Figure 2.4 has radius 3 and circular frequency $f = 60$ Hertz.
If the moving point starts at the angle $-45°$, find its x-coordinate $A\cos(\omega t - \alpha)$. The
phase lag is $\alpha = 45°$. When does the point first hit the x axis?


5 If you drive at 60 miles per hour on a circular track with radius $R = 3$ miles, what is
the time $T$ for one complete circuit? Your circular frequency is $f =$ ____ and your
angular frequency is $\omega =$ ____ (with what units?). The period is $T$.


6 The total energy $E$ in the oscillating spring-mass system is

$E$ = kinetic energy in mass + potential energy in spring = $\frac{m}{2}\left(\frac{dy}{dt}\right)^2 + \frac{k}{2}y^2$.

Compute $E$ when $y = C\cos\omega t + D\sin\omega t$. The energy is constant!

7 Another way to show that the total energy $E$ is constant:

Multiply $my'' + ky = 0$ by $y'$. Then integrate $my'y''$ and $kyy'$.
8 A forced oscillation has another term in the equation and $A\cos\omega t$ in the solution:

$\frac{d^2y}{dt^2} + 4y = F\cos\omega t$ has $y = C\cos 2t + D\sin 2t + A\cos\omega t$.

(a) Substitute $y$ into the equation to see how $C$ and $D$ disappear (they give $y_n$). Find
the forced amplitude $A$ in the particular solution $y_p = A\cos\omega t$.

(b) In case $\omega = 2$ (forcing frequency = natural frequency), what answer does your
formula give for $A$? The solution formula for $y$ breaks down in this case.


9 Following Problem 8, write down the complete solution $y_n + y_p$ to the equation

$m\frac{d^2y}{dt^2} + ky = F\cos\omega t$ with $\omega \neq \omega_n = \sqrt{k/m}$ (no resonance).

The answer $y$ has free constants $C$ and $D$ to match $y(0)$ and $y'(0)$ ($A$ is fixed by $F$).


10 Suppose Newton's Law $F = ma$ has the force $F$ in the same direction as $a$:

$my'' = +ky$ including $y'' = 4y$.

Find two possible choices of $s$ in the exponential solutions $y = e^{st}$. The solution is
not sinusoidal and $s$ is real and the oscillations are gone. Now $y$ is unstable.


11 Here is a fourth order equation: $d^4y/dt^4 = 16y$. Find four values of $s$ that give
exponential solutions $y = e^{st}$. You could expect four initial conditions on $y$:
$y(0)$ is given along with what three other conditions?


12 To find a particular solution to $y'' + 9y = e^{ct}$, I would look for a multiple
$y_p(t) = Ye^{ct}$ of the forcing function. What is that number $Y$? When does your
formula give $Y = \infty$? (Resonance needs a new formula for $Y$.)


13 In a particular solution $y = Ae^{i\omega t}$ to $y'' + 9y = e^{i\omega t}$, what is the amplitude $A$?
The formula blows up when the forcing frequency $\omega$ = what natural frequency?


14 Equation (10) says that the tangent of the phase angle is $\tan\alpha = y'(0)/\omega y(0)$.
First, check that $\tan\alpha$ is dimensionless when $y$ is in meters and time is in seconds.
Next, if that ratio is $\tan\alpha = 1$, should you choose $\alpha = \pi/4$ or $\alpha = 5\pi/4$?
Answer:

Separately you want $R\cos\alpha = y(0)$ and $R\sin\alpha = y'(0)/\omega$.
If those right hand sides are positive, choose the angle $\alpha$ between 0 and $\pi/2$.
If those right hand sides are negative, add $\pi$ and choose $\alpha = 5\pi/4$.

Question: If $y(0) > 0$ and $y'(0) < 0$, does $\alpha$ fall between $\pi/2$ and $\pi$ or between
$3\pi/2$ and $2\pi$? If you plot the vector from $(0, 0)$ to $(y(0), y'(0)/\omega)$, its angle is $\alpha$.
15 Find a point on the sine curve in Figure 2.1 where $y > 0$ but $v = y' < 0$ and also
$a = y'' < 0$. The curve is sloping down and bending down.

Find a point where $y < 0$ but $y' > 0$ and $y'' > 0$. The point is below the x-axis but
the curve is sloping ____ and bending ____.


16 (a) Solve $y'' + 100y = 0$ starting from $y(0) = 1$ and $y'(0) = 10$. (This is $y_n$.)
(b) Solve $y'' + 100y = \cos\omega t$ with $y(0) = 0$ and $y'(0) = 0$. (This can be $y_p$.)


17 Find a particular solution $y_p = R\cos(\omega t - \alpha)$ to $y'' + 100y = \cos\omega t - \sin\omega t$.


18 Simple harmonic motion also comes from a linear pendulum (like a grandfather
clock). At time $t$, the height is $A\cos\omega t$. What is the frequency $\omega$ if the pendulum
comes back to the start after 1 second? The period does not depend on the amplitude
(a large clock or a small metronome or the movement in a watch can all have $T = 1$).


19 If the phase lag is $\alpha$, what is the time lag in graphing $\cos(\omega t - \alpha)$?


20 What is the response $y(t)$ to a delayed impulse if $my'' + ky = \delta(t - T)$?


21 (Good challenge) Show that $y = \int_0^t g(t - s)f(s)\,ds$ has $my'' + ky = f(t)$.

1 Why is $y' = \int_0^t g'(t - s)f(s)\,ds + g(0)f(t)$? Notice the two $t$'s in $y$.
2 Using $g(0) = 0$, explain why $y'' = \int_0^t g''(t - s)f(s)\,ds + g'(0)f(t)$.
3 Now use $g'(0) = 1/m$ and $mg'' + kg = 0$ to confirm $my'' + ky = f(t)$.


22 With $f = 1$ (direct current has $\omega = 0$) verify that $my'' + ky = 1$ for this $y$:

Step response   $y(t) = \int_0^t \frac{\sin\omega_n(t - s)}{m\omega_n}\,1\,ds = y_p + y_n$ equals $\frac{1}{k} - \frac{1}{k}\cos\omega_n t$.


23 (Recommended) For the equation $d^2y/dt^2 = 0$ find the null solution. Then for
$d^2g/dt^2 = \delta(t)$ find the fundamental solution (start the null solution with $g(0) = 0$
and $g'(0) = 1$). For $y'' = f(t)$ find the particular solution using formula (16).


24 For the equation $d^2y/dt^2 = e^{i\omega t}$ find a particular solution $y = Y(\omega)e^{i\omega t}$. Then $Y(\omega)$
is the frequency response. Note the "resonance" when $\omega = 0$ with the null solution
$y_n = 1$.


25 Find a particular solution $Ye^{i\omega t}$ to $my'' - ky = e^{i\omega t}$. The equation has $-ky$
instead of $ky$. What is the frequency response $Y(\omega)$? For which $\omega$ is $Y$ infinite?
2.2 Key Facts About Complex Numbers


The solutions to differential equations involve real numbers $a$ and imaginary numbers $i\omega$.
They combine into complex numbers $s = a + i\omega$ (real plus imaginary). Here are
three equations and their solutions:

$\frac{dy}{dt} = ay$      $\frac{d^2y}{dt^2} + \omega^2y = 0$      $\frac{d^2y}{dt^2} - 2a\frac{dy}{dt} + (\omega^2 + a^2)y = 0$

$y = Ce^{at}$      $y = c_1e^{i\omega t} + c_2e^{-i\omega t}$      $y = c_1e^{(a + i\omega)t} + c_2e^{(a - i\omega)t}$

Chapter 1 solved $y' = ay$. Section 2.1 solved $y'' + \omega^2y = 0$. Section 2.3 will solve the last
equation $Ay'' + By' + Cy = 0$. The balance between real and imaginary (between $a$ and
$i\omega$) will come down to a competition between $B^2$ and $4AC$.

This course cannot go forward without complex numbers. You see their rectangular form
in $s = a + i\omega$ (real part and imaginary part). What you must also see is their
polar form. It is $e^{st}$, more than $s$ by itself, that demands to be seen in polar form:

$e^{st} = e^{(a + i\omega)t} = e^{at}e^{i\omega t}$      $e^{at}$ gives growth or decay      $e^{i\omega t}$ gives oscillation and rotation

The real part $a$ is the rate of growth. The imaginary part $\omega$ is the frequency of
oscillation. The addition $a + i\omega$ turns into the multiplication $e^{at}e^{i\omega t}$ because of the rule for
exponentials. We will surely see exponentials everywhere, because they solve all constant
coefficient equations: The solution to $y' = sy$ is $y = Ce^{st}$. With a forcing function $e^{i\omega t}$,
a particular solution to $y' - sy = e^{i\omega t}$ is $y_p = e^{i\omega t}/(i\omega - s)$: a complex function.

Euler's formula $e^{i\omega t} = \cos\omega t + i\sin\omega t$ brings back two real functions (cosine and
sine). Real equations have real solutions. When the forcing function on the right side is
$f = A\cos\omega t + B\sin\omega t$, a good particular solution is $y_p = M\cos\omega t + N\sin\omega t$.

In this real world, the amplitudes $\sqrt{A^2 + B^2}$ and $\sqrt{M^2 + N^2}$ are all-important.
The amplitude is what we see (in light) and hear (in sound) and feel (in vibration).


The null solutions $y_n$ and the particular solution $y_p$ need complex numbers. The form of
$y_n$ is $Ce^{st}$. The form of $y_p$ is $Ye^{i\omega t}$. The complex gain is $Y$. Notice that the $\omega$ in $s = a + i\omega$
is the natural frequency in the null solution $y_n$. The $\omega$ in the right hand side $e^{i\omega t}$ is the
driving frequency in the particular solution $y_p$.


If $\omega_{\text{natural}} = \omega_{\text{driving}}$, we will see "resonance" and we will need new formulas.


Here is the plan for this section.

1 Multiply complex numbers $s_1$ and $s_2$ (review).

2 Use the polar form $s = re^{i\theta}$ to find the powers $s^n = r^ne^{in\theta}$ (review).

3 Look especially at the equation $s^n = 1$. It has $n$ roots, all on the unit circle.

4 Find the exponential $e^{st}$ and watch it move in the complex plane.
Complex Numbers: Rectangular and Polar


A complex number $a + i\omega$ has a real part $a$ and an imaginary part $\omega$. Two complex numbers
are easy to add: real part $a_1 + a_2$, imaginary part $\omega_1 + \omega_2$. It is multiplication that looks
messy in equation (1). The good way is in equation (5).

Multiplication   $(a_1 + i\omega_1)(a_2 + i\omega_2) = (a_1a_2 - \omega_1\omega_2) + i(a_1\omega_2 + a_2\omega_1)$.   (1)

Just multiply each part $a_1$ and $i\omega_1$ by each part $a_2$ and $i\omega_2$.

Important case $s$ times $\bar{s}$   $(a + i\omega)(a - i\omega) = a^2 + \omega^2$: Real number.   (2)

$\bar{s} = a - i\omega$ is the complex conjugate of $s = a + i\omega$. Equation (2) says that $s\bar{s} = |s|^2$.

$|s| = \sqrt{a^2 + \omega^2}$ is the absolute value or magnitude or modulus of $s = a + i\omega$.
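
Equations (1) and (2) are easy to confirm with Python's built-in complex numbers. This is a small sketch for illustration: the helper name `multiply` and the sample numbers $3 + 4i$ and $1 + 2i$ are assumptions chosen here, not taken from the text.

```python
def multiply(a1, w1, a2, w2):
    """Real and imaginary parts of (a1 + i w1)(a2 + i w2), as in equation (1)."""
    return (a1 * a2 - w1 * w2, a1 * w2 + a2 * w1)

s1, s2 = complex(3, 4), complex(1, 2)
print(multiply(3, 4, 1, 2))                 # (3*1 - 4*2, 3*2 + 1*4)
print(s1 * s2)                              # Python's own complex product
print(s1 * s1.conjugate(), abs(s1) ** 2)    # s times its conjugate is |s|^2
```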


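[Figure 2.5 labels: imaginary axis, real axis, and the point $s = a + i\omega$.]

Figure 2.5: (i) The rectangular form $s = a + i\omega$. (ii) The polar form $s = re^{i\theta}$ with absolute
value $r = |s| = \sqrt{a^2 + \omega^2}$. The complex conjugate of $s$ is $\bar{s} = a - i\omega = re^{-i\theta}$.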


The polar form of $s$ uses that distance $r = |s|$ to the center point $(0, 0)$. The real numbers
$a$ and $\omega$ (rectangular) are connected to $r$ and $\theta$ (polar) by

$a = r\cos\theta$      $\omega = r\sin\theta$      $s = a + i\omega = r\cos\theta + ir\sin\theta = re^{i\theta}$.   (3)

At that moment you see Euler's Formula $e^{i\theta} = \cos\theta + i\sin\theta$. I could regard this as the
complex definition of the exponential. Or I can separate the infinite series for $e^{i\theta}$ into its real
part (the series for $\cos\theta$) and imaginary part (the series for $\sin\theta$).

Euler's Formula is used all the time, to express $e^{i\theta}$ in terms of $\cos\theta$ and $\sin\theta$. It is
useful to go the other way, and express the cosine and sine in terms of $e^{i\theta}$ and $e^{-i\theta}$:

Cosines from exponentials   $\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}$      $\sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}$.   (4)

The sine comes from subtraction. Cancel $\cos\theta$ to get $2i\sin\theta$. We need to divide by $2i$.
The Polar Form of $s^n$ and $1/s$

The polar form is perfect for multiplication and for powers $s^n$. We just multiply absolute
values of $s_1$ and $s_2$, and add their angles. Multiply $r_1r_2$ and add $\theta_1 + \theta_2$.

Multiplication $s_1s_2$   $(r_1e^{i\theta_1})(r_2e^{i\theta_2}) = r_1r_2e^{i(\theta_1 + \theta_2)}$.   (5)

Powers of $s = re^{i\theta}$   $s^n = (re^{i\theta})^n = r^ne^{in\theta}$.   (6)

If $n = 2$, we are multiplying $re^{i\theta}$ times $re^{i\theta}$ to get $r^2e^{2i\theta}$. ($\theta$ is added to $\theta$.) If $n = -1$, we
are dividing. The rectangular form of $1/(a + i\omega)$ matches the polar form of $1/(re^{i\theta})$:

$\frac{1}{a + i\omega} = \frac{1}{a + i\omega}\,\frac{a - i\omega}{a - i\omega} = \frac{a - i\omega}{a^2 + \omega^2}$      $\frac{1}{re^{i\theta}} = \frac{1}{r}\,\frac{1}{e^{i\theta}} = \frac{1}{r}e^{-i\theta}$.   (7)

That magnitude is $r = |a + i\omega| = \sqrt{a^2 + \omega^2}$. Equation (7) says that $1/s$ equals $\bar{s}/|s|^2$.
In solving $y' - ay = e^{i\omega t}$, what we meet is $y = e^{i\omega t}/(i\omega - a)$:

Gain $G$ and Phase $\alpha$   $i\omega - a = re^{i\alpha}$      $\frac{1}{i\omega - a} = \frac{1}{r}e^{-i\alpha} = Ge^{-i\alpha}$.   (8)

I prefer this polar form. When $s = re^{i\theta}$, the absolute value of $1/s$ is $1/r$. The angle is $-\theta$.


Examples   The polar form of $1 + i$ is $\sqrt{2}e^{i\pi/4}$: absolute value $r = \sqrt{1 + 1} = \sqrt{2}$.
The polar form of its conjugate $1 - i$ is $\sqrt{2}e^{-i\pi/4}$.
The polar form of its reciprocal $1/(1 + i)$ is $(1/\sqrt{2})e^{-i\pi/4}$.


Notice that we can add $2\pi$ to the angle $\theta$. That brings us around a circle and back to the
same point. Then $e^{i\theta} = e^{i(\theta + 2\pi)}$ and $e^{-i\pi/4} = e^{7\pi i/4}$.
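
The three examples can be reproduced with the standard `cmath` module. The sketch below is an illustration, not part of the text: `cmath.polar` returns the absolute value and the angle, and `cmath.rect` goes back from polar to rectangular form.

```python
import cmath
import math

for s in (1 + 1j, 1 - 1j, 1 / (1 + 1j)):
    r, theta = cmath.polar(s)        # absolute value r and angle theta in (-pi, pi]
    print(s, r, theta / math.pi)     # angle shown as a multiple of pi

# Adding 2*pi to the angle returns to the same point on the circle.
same = cmath.rect(1.0, -math.pi / 4) - cmath.rect(1.0, 7 * math.pi / 4)
print(abs(same))                     # roundoff only
```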


The Unit Circle
The polar form brings out the importance of the unit circle in the complex plane. That circle
contains all complex numbers with absolute value $r = |s| = 1$. The numbers on the unit
circle are exactly $s = e^{i\theta} = \cos\theta + i\sin\theta$.


Since $r = 1$, every $r^n$ is also 1. All powers like $s^2$ and $s^{-1}$ stay on the unit circle.
The angles in Figure 2.6 become $2\theta$ and $-\theta$. The nth power $s^n$ has angle $n\theta$.


Here is a nice application of complex numbers to trigonometry. The "double angle"
formulas for $\cos 2\theta$ and $\sin 2\theta$ are not so easy to remember. The "triple angle" formulas for
$\cos 3\theta$ and $\sin 3\theta$ are even harder. But all these formulas come from one simple fact:

$(e^{i\theta})^n = e^{in\theta}$      $(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta$.   (9)

If you take $n = 2$, you are squaring $e^{i\theta} = \cos\theta + i\sin\theta$ to get $e^{2i\theta}$:

$(\cos\theta + i\sin\theta)^2 = \cos^2\theta - \sin^2\theta + 2i\cos\theta\sin\theta = \cos 2\theta + i\sin 2\theta$.   (10)

The real part $\cos^2\theta - \sin^2\theta$ is $\cos 2\theta$. The imaginary part $2\sin\theta\cos\theta$ is $\sin 2\theta$.
For triple angles, multiply again by $\cos\theta + i\sin\theta$ (in Problem 4).
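
Equation (9) can be spot-checked numerically for any angle and any whole number $n$. The sketch below is illustrative only: the function name `de_moivre_gap` and the sample angle 0.7 are assumptions.

```python
import math

def de_moivre_gap(theta, n):
    """|(cos t + i sin t)^n - (cos nt + i sin nt)|, which should be roundoff only."""
    lhs = complex(math.cos(theta), math.sin(theta)) ** n
    rhs = complex(math.cos(n * theta), math.sin(n * theta))
    return abs(lhs - rhs)

theta = 0.7
print(de_moivre_gap(theta, 2), de_moivre_gap(theta, 3))
# The real part of equation (10): cos^2 - sin^2 against cos(2 theta).
print(math.cos(theta) ** 2 - math.sin(theta) ** 2, math.cos(2 * theta))
```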
[Figure 2.6 label: $s = e^{i\pi/4} = \cos\frac{\pi}{4} + i\sin\frac{\pi}{4}$.]

Figure 2.6: The number $s = e^{i\theta}$ has $s^2 = e^{2i\theta}$ and $s^{-1} = e^{-i\theta}$, all on the circle with $r = 1$.
Here $\theta = 45°$ which is $\pi/4$ radians. So $2\theta = 90°$ and $s^2 = i$. Then $s^8 = 1$.


The Equation $s^n = 1$


There are two numbers with $s^2 = 1$ (they are $s = 1$ and $-1$). There are four numbers with
$s^4 = 1$ (they are $1$ and $-1$ and $i$ and $-i$). Those four numbers are equally spaced around the
unit circle. This is the pattern for every equation $s^n = 1$: $n$ numbers equally spaced around
the unit circle, starting with $s = 1$. The Fundamental Theorem of Algebra says that nth
degree equations have $n$ (possibly complex) solutions. The equation $s^n = 1$ is no exception,
and all its roots are on the unit circle.

$n$ roots of $s^n = 1$      $s = e^{2\pi i/n},\ e^{4\pi i/n},\ \ldots,\ e^{2n\pi i/n} = e^{2\pi i} = 1$

These are the powers $s, s^2, \ldots, s^n$ of the special complex number $s = e^{2\pi i/n}$. This number
$s = e^{2\pi i/8}$ is the first of the 8 solutions to $s^8 = 1$, going around the circle in Figure 2.6.


Here is a remarkable fact about the solutions to $s^n = 1$. Those $n$ numbers add to zero.
In Figure 2.6, you can see that $s^5 = -s$ and $s^6 = -s^2$ and $s^7 = -s^3$ and $s^8 = -s^4$.
The roots pair off. Each pair adds to zero. So the 8 roots add to zero.

For $n = 3$ or 5 or 7, this pairing off will not work. The three solutions to $s^3 = 1$ are at
120° angles. ($s$ and $s^2$ are $e^{2\pi i/3}$ and $e^{4\pi i/3}$, at angles 120° and 240°. Then comes 360°.)


To show that those three numbers add to zero, I will factor $s^3 - 1 = 0$:

$0 = s^3 - 1 = (s - 1)(s^2 + s + 1)$ leads to $s^2 + s + 1 = 0$.   (11)

For $s = e^{2\pi i/3}$ the factor $s - 1$ is not zero, and $s^2 + s + 1$ is exactly the sum of the three roots.


The n numbers on the unit circle go into the Fourier matrix. They are the key to the 
overwhelming success of the Fast Fourier Transform in Section 8.2. 
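
The statement that the $n$ roots add to zero is simple to test for odd and even $n$ alike. This sketch is an illustration (the function name `roots_of_unity` is an assumption). It lists the roots with $k = 0, \ldots, n - 1$, which is the same set as the powers $s, s^2, \ldots, s^n$ above because $s^n = 1$.

```python
import cmath

def roots_of_unity(n):
    """The n solutions e^(2 pi i k/n), k = 0, ..., n-1, of s^n = 1."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]

for n in (3, 5, 8):
    roots = roots_of_unity(n)
    print(n, abs(sum(roots)))   # roundoff-sized: the roots add to zero
```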
The Exponentials $e^{i\omega t}$ and $e^{st}$


We use complex numbers to solve differential equations. For $dy/dt = ay$ the solution
$y = Ce^{at}$ is real. But second order equations can bring oscillations $e^{i\omega t}$ together with
growth/decay from $e^{at}$. Now $y$ has sines and cosines, or complex exponentials.
Our goal is to follow those pieces of the complete solution to $Ay'' + By' + Cy = 0$.
Where does the point $e^{(a + i\omega)t}$ travel in the complex plane? The next section connects
$a$ and $\omega$ to the numbers $A$, $B$, $C$ and solves the differential equation.


The best way to track the path of $e^{(a + i\omega)t}$ is to separate $a$ from $i\omega$. The path of $e^{i\omega t}$ is a
circle. The factor $e^{at}$ turns the circle into a spiral.


Rule for exponentials   $e^{(a + i\omega)t} = e^{at}e^{i\omega t}$.   (13)


This is the polar form! The factor $e^{at}$ is the absolute value $r$. The angle $\omega t$ is the phase
angle $\theta$. As the time $t$ increases, we follow those two parts:

Absolute value   $e^{at}$ grows with $t$ if $a > 0$      $e^{at}$ decays if $a < 0$

Phase angle   $e^{i\omega t}$ goes around the unit circle when $t$ increases by $2\pi/\omega$

The real part $a$ decides stability. This is just like Chapter 1. We will see that damping
produces $a < 0$ which is stability. In that case $B > 0$ in $y'' + By' + Cy = 0$.


This section is about the $i\omega$ part of the exponent $s$. That produces the $e^{i\omega t}$ part of the
solution $y = e^{st}$. The pure oscillations in Section 2.1 came from $my'' + ky = 0$ with
no damping. They had only this $e^{i\omega t}$ part (along with $e^{-i\omega t}$, which travels in the opposite
direction around the unit circle). The frequency is $\omega = \sqrt{k/m}$.


Watch $e^{i\omega t}$ as it goes around the circle. If you follow its horizontal motion (its shadow
on the x axis) you will see $\cos\omega t$. If you follow its height on the y axis, you will see $\sin\omega t$.
The circle is complete when $\omega t = 2\pi$. So the period is $T = 2\pi/\omega$.


Figure 2.7: $y'' + \omega^2y = 0$: One complex solution $e^{i\omega t}$ produces two real solutions.


When we multiply $e^{i\omega t}$ by $e^{at}$, their product $e^{st}$ gives a spiral. The spiral goes
in to the center if $a$ is negative. The spiral goes outward if $a > 0$. You are seeing the benefit of
complex numbers, to merge oscillation and decay into one function. The real functions are
$e^{at}\cos\omega t$ and $e^{at}\sin\omega t$. The complex function is $e^{at}e^{i\omega t} = e^{st}$.


Question   What will be the time $T$ and the crossing point $X$, when the spiral completes
one loop and returns to the positive x-axis?


Answer   The time $T$ will be $2\pi/\omega$, to complete each loop of the spiral. The crossing
point on the x-axis will be $X = e^{aT}$. At time $2T$, the crossing will be at $X^2$.
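
The answer can be confirmed by following one point on the spiral. The sketch below is illustrative: the function name `spiral_point` and the sample values $a = -0.1$, $\omega = 2$ are assumptions chosen here.

```python
import cmath
import math

def spiral_point(a, omega, t):
    """The complex number e^((a + i omega) t)."""
    return cmath.exp(complex(a, omega) * t)

a, omega = -0.1, 2.0
T = 2.0 * math.pi / omega               # time for one loop
X = spiral_point(a, omega, T)
print(X, math.exp(a * T))               # imaginary part is roundoff; real part is e^(aT)
print(spiral_point(a, omega, 2 * T))    # close to X squared
```

Because $a$ is negative here, $X$ is less than 1 and each loop crosses the positive x-axis closer to the center.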
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11 


Problem Set 2.2 


Mark the numbers $s_1 = 2 + i$ and $s_2 = 1 - 2i$ as points in the complex plane. (The
plane has a real axis and an imaginary axis.) Then mark the sum $s_1 + s_2$ and the
difference $s_1 - s_2$.


Multiply $s_1 = 2 + i$ times $s_2 = 1 - 2i$. Check absolute values: $|s_1||s_2| = |s_1 s_2|$.


Find the real and imaginary parts of $1/(2 + i)$. Multiply by $(2 - i)/(2 - i)$:

$\frac{1}{2+i} \cdot \frac{2-i}{2-i} = \frac{2-i}{|2+i|^2}$


Triple angles Multiply equation (10) by another $e^{i\theta} = \cos\theta + i\sin\theta$ to find
formulas for $\cos 3\theta$ and $\sin 3\theta$.


Addition formulas Multiply $e^{i\theta} = \cos\theta + i\sin\theta$ times $e^{i\phi} = \cos\phi + i\sin\phi$
to get $e^{i(\theta+\phi)}$. Its real part is $\cos(\theta+\phi) = \cos\theta\cos\phi - \sin\theta\sin\phi$. What is its
imaginary part $\sin(\theta+\phi)$?


Find the real part and the imaginary part of each cube root of 1. Show directly that the 
three roots add to zero, as equation (11) predicts. 


The three cube roots of 1 are $z$ and $z^2$ and 1, when $z = e^{2\pi i/3}$. What are the three
cube roots of 8 and the three cube roots of $i$? (The angle for $i$ is 90° or $\pi/2$, so
the angle for one of its cube roots will be ____. The roots are spaced by 120°.)


(a) The number $i$ is equal to $e^{\pi i/2}$. Then its $i$th power $i^i$ comes out equal to
a real number, using the fact that $(e^s)^t = e^{st}$. What is that real number $i^i$?


(b) $e^{i\pi/2}$ is also equal to $e^{5\pi i/2}$. Increasing the angle by $2\pi$ does not
change $e^{i\theta}$: it comes around a full circle and back to $i$. Then $i^i$ has another
real value $(e^{5\pi i/2})^i = e^{-5\pi/2}$. What are all the possible values of $i^i$?


The numbers $s = 3 + i$ and $\bar{s} = 3 - i$ are complex conjugates. Find their sum
$s + \bar{s} = -B$ and their product $(s)(\bar{s}) = C$. Then show that $s^2 + Bs + C = 0$ and also
$\bar{s}^2 + B\bar{s} + C = 0$. Those numbers $s$ and $\bar{s}$ are the two roots of the quadratic equation
$x^2 + Bx + C = 0$.


The numbers $s = a + i\omega$ and $\bar{s} = a - i\omega$ are complex conjugates. Find their sum
$s + \bar{s} = -B$ and their product $(s)(\bar{s}) = C$. Then show that $s^2 + Bs + C = 0$. The
two solutions of $x^2 + Bx + C = 0$ are $s$ and $\bar{s}$.

(a) Find the numbers (1 + 7)* and (1 + 1). 

(b) Find the polar form $re^{i\theta}$ of $(1 + i\sqrt{3})/(\sqrt{3} + i)$.


The number $z = e^{2\pi i/n}$ solves $z^n = 1$. The number $Z = e^{2\pi i/2n}$ solves $Z^{2n} = 1$.
How is $z$ related to $Z$? (This plays a big part in the Fast Fourier Transform.)


(a) If you know $e^{i\theta}$ and $e^{-i\theta}$, how can you find $\sin\theta$?
(b) Find all angles $\theta$ with $e^{i\theta} = -1$, and all angles $\phi$ with $e^{i\phi} = i$.
Locate all these points on one complex plane:

(a) $2 + i$    (b) $(2 + i)^2$    (c) $\frac{1}{2+i}$    (d) $|2 + i|$


Find the absolute values $r = |z|$ of these four numbers. If $\theta$ is the angle for $6 + 8i$,
what are the angles for these four numbers?

(a) $6 - 8i$    (b) $(6 - 8i)^2$    (c) $\frac{1}{6 - 8i}$    (d) $8i + 6$


What are the real and imaginary parts of e* + and e@ + ™ 9 


(a) If $|s| = 2$ and $|z| = 3$, what are the absolute values of $sz$ and $s/z$?


(b) Find upper and lower bounds in $L \le |s + z| \le U$. When does $|s + z| = U$?


(a) Where is the product $(\sin\theta + i\cos\theta)(\cos\theta + i\sin\theta)$ in the complex plane?
(b) Find the absolute value $|S|$ and the polar angle $\phi$ for $S = \sin\theta + i\cos\theta$.


This is my favorite problem, because $S$ combines $\cos\theta$ and $\sin\theta$ in a new way.
To find $\phi$, you could plot $S$ or add angles in the multiplication of part (a).


Draw the spirals $e^{(1-i)t}$ and $e^{(2-2i)t}$. Do those follow the same curves? Do
they go clockwise or anticlockwise? When the first one reaches the negative x-axis,
what is the time $T$? What point has the second one reached at that time?


The solution to $d^2y/dt^2 = -y$ is $y = \cos t$ if the initial conditions are $y(0) =$ ____
and $y'(0) =$ ____. The solution is $y = \sin t$ when $y(0) =$ ____ and
$y'(0) =$ ____. Write each of those solutions in the form $c_1 e^{it} + c_2 e^{-it}$, to see
that real solutions can come from complex $c_1$ and $c_2$.


Suppose y(t) = et elt solves y” + By’ + Cy = 0. What are B and C'? If this 
equation is solved by y = e®”*, what are B and C’? 


From the multiplication $e^{iA} e^{-iB} = e^{i(A-B)}$ find the "subtraction formulas"
for $\cos(A - B)$ and $\sin(A - B)$.


(a) If $r$ and $R$ are the absolute values of $s$ and $S$, show that $rR$ is the absolute value
of $sS$. (Hint: Polar form!)


(b) If $\bar{s}$ and $\bar{S}$ are the complex conjugates of $s$ and $S$, show that $\bar{s}\bar{S}$ is the complex
conjugate of $sS$. (Polar form!)


Suppose a complex number $s$ solves a real equation $s^3 + As^2 + Bs + C = 0$
(with $A, B, C$ real). Why does the complex conjugate $\bar{s}$ also solve this equation?
"Complex solutions to real equations come in conjugate pairs $s$ and $\bar{s}$."


(a) If two complex numbers add to $s + S = 6$ and multiply to $sS = 10$, what are
$s$ and $S$? (They are complex conjugates.)

(b) If two numbers add to $s + S = 6$ and multiply to $sS = -16$, what are $s$ and
$S$? (Now they are real.)


If two numbers $s$ and $S$ add to $s + S = -B$ and multiply to $sS = C$, show that $s$ and
$S$ solve the quadratic equation $s^2 + Bs + C = 0$.


Find three solutions to $s^3 = -8i$ and plot the three points in the complex plane. What
is the sum of the three solutions?


(a) For which complex numbers $s = a + i\omega$ does $e^{st}$ approach 0 as $t \to \infty$?
Those numbers $s$ fill which "half-plane" in the complex plane?


(b) For which complex numbers $s = a + i\omega$ does $s^n$ approach 0 as $n \to \infty$?
Those numbers $s$ fill which part of the complex plane? Not a half-plane!


2.3 Constant Coefficients A, B, C


Section 2.1 presented the important equation $my'' + ky = 0$. That is a special case of
this second order constant coefficient equation. We still need two initial conditions:

$A\frac{d^2y}{dt^2} + B\frac{dy}{dt} + Cy = 0$ starting from $y(0)$ and $y'(0)$. (1)


The coefficients $A, B, C$ can be any constants. For pure oscillation, $A$ was the mass $m$
and $C$ was the spring constant $k$, both positive. $B > 0$ introduces damping. In this
section the numbers $A, B, C$ can be positive or negative or zero, so we may have
exponential growth or decay or (damped) oscillation. With zero on the right hand side of
equation (1), this section is finding null solutions $y_n$: unforced motion.

Our first job is to solve equation (1). When the coefficients are constant, we always
look for exponentials $e^{st}$. That number $s$ can be positive ($y$ will grow) or negative
($y$ decays) or pure imaginary ($y$ oscillates). If $s$ is a complex number $a + i\omega$, then its
real part $a$ controls growth or decay. The imaginary part $\omega$ controls oscillation.

We will see the solutions clearly, because $A, B, C$ are constant. The right choice of
$y(0)$ and $y'(0)$ will produce the growth factor $g(t)$ that multiplies all inputs to give $y_p$.


The key step is to find the rate $s$ in $y = e^{st}$. A second order equation normally has
two possible rates $s_1$ and $s_2$. To find those numbers, substitute $y = e^{st}$ into equation (1):

$As^2 e^{st} + Bs e^{st} + Ce^{st} = 0$. (2)

The factor $e^{st}$ can be divided out because it is never zero. This leaves an all-important
equation to determine $s$:

Characteristic equation $As^2 + Bs + C = 0$. (3)


This is an ordinary quadratic equation for $s$. Every quadratic has two roots $s_1$ and $s_2$.
They could be real, they could be complex, they could be equal. The two roots come from
the quadratic formula:

Two values for $s$    $s_1 = \frac{-B + \sqrt{B^2 - 4AC}}{2A}$    $s_2 = \frac{-B - \sqrt{B^2 - 4AC}}{2A}$. (4)

Those roots add up to $s_1 + s_2 = -B/A$. The roots multiply to give $s_1 s_2 = C/A$.


The question of real roots or complex roots is highly important, and it has a direct answer:

Real roots $B^2 > 4AC$    Equal roots $B^2 = 4AC$    Complex roots $B^2 < 4AC$


When $B^2 - 4AC$ is positive, its square root is real. Then we have real roots $s_1 > s_2$.

When $B^2 - 4AC = 0$, its square root is zero and $s_1 = s_2$ (borderline case: equal roots).

When $B^2 - 4AC$ is negative, its square root is imaginary. The quadratic formula (4)
produces two complex numbers $a + i\omega$ and $a - i\omega$ with the same real part $a = -B/2A$.
Let me look at all three cases, starting with examples.


Two Real Roots, One Double Root, No Real Roots


A picture will show you how $B^2 - 4AC$ decides real vs. complex. The three parabolas
in Figure 2.8 have $C = 0$ and $C = 1$ and $C = 2$. By increasing $C$ we lift the parabolas.
The critical value is $C = 1$, when the middle parabola barely touches $y = 0$ at $s = 1$.
$C = 1$ gives a double root and in this case $B^2 = 4AC = 4$.


[Figure 2.8 graph: three parabolas $y = s^2 - 2s + C$. For $C = 2$: $y = s^2 - 2s + 2$, no real roots, $s = 1 \pm i$. For $C = 1$: $y = s^2 - 2s + 1$, equal roots, $s = 1, 1$. For $C = 0$: $y = s^2 - 2s + 0 = s(s - 2)$, real roots, $s = 0, 2$.]

Figure 2.8: Lowest curve: Two roots for $C = 0$. Middle curve: Double root for $C = 1$.
Highest curve misses the axis: No real roots for $C = 2$ (complex roots $a \pm i\omega$).


All three parabolas have $A = 1$ and $B = -2$ and $B^2 = 4$. The test that compares
$B^2$ to $4AC$ is comparing 4 to $4C$. This shows again that $C = 1$ is at the critical
borderline $B^2 = 4AC$. Any value $C > 1$ will lift the parabola above the $y = 0$ axis.
The roots of $s^2 - 2s + C = 0$ will be complex, and $y'' - 2y' + Cy = 0$ will give
oscillation (growing rather than damped here, because the real part $a = -B/2A = 1$ is positive).

For $C = 2$ that equation becomes $(s - 1)^2 = -1$. Then $s - 1 = i$ or $s - 1 = -i$.
The two complex roots are $s = 1 + i$ and $s = 1 - i$. The quadratic formula (4) agrees.
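The three cases can be checked numerically. The sketch below is one possible way to do it, with a hypothetical helper `characteristic_roots` (the name and the label strings are assumptions, not from the text). It applies the quadratic formula (4) to $s^2 - 2s + C$ for $C = 0, 1, 2$ and reports which case of the $B^2$ versus $4AC$ test applies:

```python
import cmath

def characteristic_roots(A, B, C):
    """Roots of A s^2 + B s + C = 0 and which of the three cases applies."""
    disc = B * B - 4 * A * C
    root = cmath.sqrt(disc)            # imaginary when disc < 0
    s1 = (-B + root) / (2 * A)
    s2 = (-B - root) / (2 * A)
    if disc > 0:
        kind = "real roots"
    elif disc == 0:
        kind = "equal roots"
    else:
        kind = "complex roots"
    return kind, s1, s2

# The three parabolas of Figure 2.8: A = 1, B = -2, C = 0, 1, 2.
for C in (0, 1, 2):
    print(C, characteristic_roots(1, -2, C))
```

Using `cmath.sqrt` keeps one formula for all three cases, at the cost of returning complex numbers even when the roots are real.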


Real Roots $s_1 > s_2$

Example 1 $y'' + 3y' + 2y = 0$ with $y = e^{st}$    Substitute $A, B, C = 1, 3, 2$ to find $s$.

$As^2 + Bs + C = s^2 + 3s + 2 = 0$ factors into $(s + 1)(s + 2) = 0$. (5)


The roots are both negative: $s_1 = -1$ and $s_2 = -2$. Those numbers come from the quadratic
formula (4) and they come faster from the factors in (5): The first factor $s + 1$ is
zero when $s_1 = -1$, and $s + 2 = 0$ when $s_2 = -2$. Damping means negative $s$, and negative $s$ means stability.

The complete solution to our linear differential equation is any combination of the
two pure exponential solutions. These are null solutions (homogeneous solutions).


Null solutions $y(t) = c_1 e^{s_1 t} + c_2 e^{s_2 t} = c_1 e^{-t} + c_2 e^{-2t}$ (6)


The numbers $c_1$ and $c_2$ are chosen to make $y(0)$ and $y'(0)$ correct when $t = 0$:

Set $t = 0$    $y(0) = c_1 + c_2$ and $y'(0) = -c_1 - 2c_2$. (7)

Those two equations safely determine $c_1 = 2y(0) + y'(0)$ and $c_2 = -y(0) - y'(0)$:

Final solution $y(t) = c_1 e^{-t} + c_2 e^{-2t} = y(0)(2e^{-t} - e^{-2t}) + y'(0)(e^{-t} - e^{-2t})$.
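As a quick check of Example 1, the sketch below builds $y(t)$ from the constants $c_1$ and $c_2$ above and estimates $y'' + 3y' + 2y$ by finite differences. The function names and the test values $y(0) = 3$, $y'(0) = -1$ are assumptions for illustration only:

```python
import math

def example1_solution(y0, v0, t):
    """y(t) for y'' + 3y' + 2y = 0 with y(0) = y0 and y'(0) = v0."""
    c1 = 2 * y0 + v0
    c2 = -y0 - v0
    return c1 * math.exp(-t) + c2 * math.exp(-2 * t)

def example1_residual(y0, v0, t, h=1e-4):
    """Finite-difference estimate of y'' + 3y' + 2y at time t (should be near 0)."""
    y_minus = example1_solution(y0, v0, t - h)
    y_mid = example1_solution(y0, v0, t)
    y_plus = example1_solution(y0, v0, t + h)
    d1 = (y_plus - y_minus) / (2 * h)                 # central difference for y'
    d2 = (y_plus - 2 * y_mid + y_minus) / (h * h)     # central difference for y''
    return d2 + 3 * d1 + 2 * y_mid

print(example1_solution(3.0, -1.0, 0.0))   # should reproduce y(0) = 3
print(example1_residual(3.0, -1.0, 0.5))   # should be close to zero
```

The residual is not exactly zero because the derivatives are only approximated; it shrinks with the step `h` until roundoff takes over.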


Example 2 Solve $y'' - 3y' + 2y = 0$. The coefficient $B$ has changed from 3 to $-3$.
Solution Substitute $y = e^{st}$ as before. Negative damping gives positive $s$.


$s^2 - 3s + 2 = (s - 1)(s - 2) = 0$    $s_1 = 2$ and $s_2 = 1$.


The complete solution is now $y(t) = c_1 e^{2t} + c_2 e^{t}$. Exponential growth = instability.


Equal Roots $s_1 = s_2$


The roots of $As^2 + Bs + C$ will be equal when $B^2 = 4AC$. When you factor the quadratic,
you see $(s - s_1)^2$ times $A$. The factor $s - s_1$ appears twice: $s = s_1$ is now a double root.
Our $e^{st}$ method has a problem when it finds one double root $s = s_1$. After $y = e^{s_1 t}$,
what is a second solution to our second order equation?
We will show that $y = te^{s_1 t}$ is also a solution when $s_2 = s_1$.


Example 3 Solve $y'' - 2y' + y = 0$. Those coefficients $1, -2, 1$ have $B^2 = 4AC$.
Solution Substitute $y = e^{st}$ as usual. The root $s = 1$ is repeated: two equal roots.


$s^2 - 2s + 1 = 0$    $(s - 1)^2 = 0$    $s_1 = 1 = s_2$


With that root, $y = e^{t}$ solves the equation: easy to check. A second solution is needed! We
now confirm that $y = te^{st} = te^{t}$ is also a solution of $y'' - 2y' + y = 0$:


$y' = (te^{t})' = te^{t} + e^{t}$    $y'' - 2y' + y = (te^{t} + 2e^{t}) - 2(te^{t} + e^{t}) + (te^{t}) = 0$


A double root of $As^2 + Bs + C = 0$ must be $s_1 = -B/2A$.
Then $y_1 = e^{s_1 t}$ and also $y_2 = te^{s_1 t}$ solve $Ay'' + By' + Cy = 0$.


Proof With simple roots, the lowest parabola in Figure 2.8 cuts across $Y = 0$.
The middle parabola $Y = (s - 1)^2$ is tangent to the $Y = 0$ axis at the double root 1, 1.
"The graph touches twice at the same point $s = s_1$." The root is $s_1 = s_2 = -B/2A$.


Height zero $Y = As_1^2 + Bs_1 + C = 0$ and also Slope zero $\frac{dY}{ds} = 2As_1 + B = 0$. (8)


To confirm that $Ay'' + By' + Cy$ is zero for $y = te^{s_1 t}$, look at $y$ and $y'$ and $y''$:


$y' = s_1 te^{s_1 t} + e^{s_1 t} = s_1 y + e^{s_1 t}$
$y'' = s_1 y' + s_1 e^{s_1 t} = s_1(s_1 y + e^{s_1 t}) + s_1 e^{s_1 t} = s_1^2 y + 2s_1 e^{s_1 t}$


Substituting $y''$ and $y'$ and $y$ into $Ay'' + By' + Cy$, we get $0 + 0$ from equation (8):

$A(s_1^2 y + 2s_1 e^{s_1 t}) + B(s_1 y + e^{s_1 t}) + Cy = (As_1^2 + Bs_1 + C)y + (2As_1 + B)e^{s_1 t} = 0 + 0$.
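The cancellation in that last line can be tried with numbers. This is only a sketch: the helper name `double_root_residual` and the sample coefficients are assumptions. It evaluates $Ay'' + By' + Cy$ for $y = te^{s_1 t}$ with $s_1 = -B/2A$, using the same expressions for $y'$ and $y''$ as the derivation in the text:

```python
import math

def double_root_residual(A, B, C, t):
    """A y'' + B y' + C y for y = t e^{s1 t} with s1 = -B/(2A)."""
    s1 = -B / (2 * A)
    e = math.exp(s1 * t)
    y = t * e
    dy = s1 * y + e                     # y'  = s1 y + e^{s1 t}
    d2y = s1 * s1 * y + 2 * s1 * e      # y'' = s1^2 y + 2 s1 e^{s1 t}
    return A * d2y + B * dy + C * y

print(double_root_residual(1, -2, 1, 1.5))   # Example 3: B^2 = 4AC, residual near 0
print(double_root_residual(1, 4, 4, 0.3))    # another double root, s1 = -2
print(double_root_residual(1, 3, 2, 1.0))    # B^2 != 4AC: residual is not zero
```

The third call shows why the factor $t$ only works at a double root: when $B^2 \ne 4AC$, the term $(As_1^2 + Bs_1 + C)y$ does not vanish at $s_1 = -B/2A$.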


The quadratic formula agrees with $s_1 = -B/2A = s_2$, because $B^2 - 4AC = 0$.
The square root disappears, leaving $-B/2A$ for both solutions. Here is the simplest example
of a double root $s_1 = s_2$ and a factor $t$ in the second solution.


Example 4 Solve $y'' = 0$. The coefficients $1, 0, 0$ have $B^2 = 4AC$.


Solution Substitute $y = e^{st}$ to find $s^2 e^{st} = 0$ and $s^2 = 0$. The double root is $s = 0$.
The usual solution $y = e^{st} = e^{0t} = 1$ does have $y'' = 0$. We need a second solution.


The rule $y = te^{st}$ still applies when $s = 0$. That second solution is $y = te^{0t} = t$.
We know this already: $y = 1$ and $y = t$ solve $y'' = 0$.


Higher Order Equations 


Problem 18 will extend these ideas to $n$th order equations (still constant coefficients!).
Substitute $y = e^{st}$ to get an $n$th degree polynomial in $s$. Now there are $n$ roots. If those
roots $s_1, s_2, \ldots, s_n$ are all different, they give $n$ independent solutions $y = e^{st}$. But
if a root $s_1$ is repeated two or three or $m$ times, we need $m$ different solutions for $s = s_1$:


Multiplicity $m$    The $m$ solutions are $y = e^{s_1 t}$, $y = te^{s_1 t}$, $\ldots$, $y = t^{m-1} e^{s_1 t}$. (9)

A simple example would be the equation $y'''' = 0$. Substituting $y = e^{st}$ leads to $s^4 = 0$.
This equation has four zero roots (multiplicity $m = 4$). The four solutions predicted by
equation (9) are $y = 1, t, t^2, t^3$. No surprise that those all satisfy the equation $y'''' = 0$:
their fourth derivatives are zero.

Here is a fourth order equation that produces two real roots and two complex roots:


$y'''' - y = 0$    $y = e^{st}$ leads to $s^4 - 1 = 0$ (10)


The four roots are $s_1 = 1$ and $s_2 = -1$ and $s_3 = i$ and $s_4 = -i$. Then the complete
solution to $y'''' = y$ is $y = c_1 e^{t} + c_2 e^{-t} + c_3 e^{it} + c_4 e^{-it}$.
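The four roots of $s^4 = 1$ sit on the unit circle at equal spacing. A short sketch, with an assumed helper `roots_of_unity` that is not part of the text, generates them from $e^{2\pi i k/n}$ and confirms that each one satisfies $s^4 - 1 = 0$:

```python
import cmath
import math

def roots_of_unity(n):
    """The n complex solutions of s^n = 1, equally spaced on the unit circle."""
    return [cmath.exp(2j * math.pi * k / n) for k in range(n)]

roots = roots_of_unity(4)          # close to 1, i, -1, -i up to roundoff
for s in roots:
    print(s, abs(s ** 4 - 1) < 1e-12)
```

The same helper with `n = 8` gives the eight exponents for an eighth order equation $d^8y/dt^8 = y$.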




Complex Roots $s_1 = a + i\omega$ and $s_2 = a - i\omega$


The formula for the roots of a quadratic includes the square root of $B^2 - 4AC$.
When that number is negative, the square root is imaginary. The example $y'' + y = 0$
has $A, B, C$ equal to $1, 0, 1$, so $B^2 - 4AC = -4$. The quadratic is $As^2 + Bs + C = s^2 + 1$.


The solutions to $s^2 + 1 = 0$ are $s = i$ and $s = -i$. The solutions to $s^2 + 4 = 0$ are $s = 2i$
and $s = -2i$. The oscillations from $y'' + 4y = 0$ can be written in two ways:

$B = 0$: No damping    $y = c_1 e^{2it} + c_2 e^{-2it} = C_1 \cos 2t + C_2 \sin 2t$. (11)


The real part of $s$ is zero when $B = 0$: pure oscillation.
Now bring in damping: $y'' + y' + y = 0$. For the solutions to $s^2 + s + 1 = 0$,
go to the quadratic formula: $A, B, C$ are $1, 1, 1$ and $B^2 - 4AC$ is $-3$:


$s^2 + s + 1 = 0$    $s_1 = \frac{-1 + \sqrt{-3}}{2} = -\frac{1}{2} + \frac{\sqrt{3}}{2} i$    $s_2 = -\frac{1}{2} - \frac{\sqrt{3}}{2} i$.

The two complex roots $s_1$ and $s_2$ have the same real part $a = -1/2$. Their imaginary parts
$\omega$ and $-\omega$ have opposite signs (as in $\sqrt{3}/2$ and $-\sqrt{3}/2$). Those are the plus and
minus signs on the square root of $B^2 - 4AC$. Assuming that $A, B, C$ are real numbers,
the two roots of $As^2 + Bs + C = 0$ are complex conjugates. If I place $s_1$ and $s_2$ onto
the complex plane, they are symmetric mirror images across the real axis.


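[Figure: the roots $s_1 = a + i\omega$ and $s_2 = a - i\omega$ plotted in the complex plane, on a circle of radius 1 around the origin, with the imaginary axis marked. Their product is $a^2 + \omega^2 = C/A = 1$.]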


The conjugate of $s = a + i\omega$ is $\bar{s} = a - i\omega$. The magnitude is $|s| = \sqrt{a^2 + \omega^2}$.

In the example with $a = -1/2$ and $\omega = \sqrt{3}/2$, the magnitude is exactly $|s| = 1$.
This is because $(-1/2)^2 + (\sqrt{3}/2)^2 = 1$. The circle in the picture has radius 1. The unit
circle is extremely important to recognize. The complex numbers on that circle
have the form $s = \cos\theta + i\sin\theta$, because $(\text{cosine})^2 + (\text{sine})^2 = 1$. The angle $\theta$ is measured
from the positive real axis. In the figure this angle is 120° or $2\pi/3$.

The points on the unit circle are given by Euler's Formula $e^{i\theta} = \cos\theta + i\sin\theta$.


We can switch between the complex form for $y(t)$ and its equivalent real form.

Complex $y(t) = e^{at}(c_1 e^{i\omega t} + c_2 e^{-i\omega t})$    Real $y(t) = e^{at}(C_1 \cos\omega t + C_2 \sin\omega t)$


Euler's formula for $e^{i\omega t}$ and $e^{-i\omega t}$ shows that $C_1 = c_1 + c_2$ and $C_2 = ic_1 - ic_2$.

With those key facts about complex numbers $a + i\omega$, we come back to the example
$s^2 + s + 1 = 0$ and the differential equation it comes from:


$\frac{d^2y}{dt^2} + \frac{dy}{dt} + y = 0$ is solved by $y(t) = c_1 e^{(a+i\omega)t} + c_2 e^{(a-i\omega)t}$.


This number $e^{(a+i\omega)t}$ is not on the unit circle. The real part $a = -1/2$ is responsible.
When $a = 0$, $e^{i\omega t}$ goes around the circle. When $a < 0$, $e^{(a+i\omega)t}$ spirals to zero: damped.
The magnitude of $e^{i\omega t}$ is 1, but $e^{at}$ grows large or small depending on the sign of $a$:


Growth $a > 0$    Magnitude $|e^{(a+i\omega)t}| = e^{at} \to \infty$
Decay $a < 0$    Magnitude $|e^{(a+i\omega)t}| = e^{at} \to 0$

That real part is always $a = -B/2A$. Every equation $Ay'' + By' + Cy = 0$ will have
damping and decay if $A$ and $B$ are positive. Here is an example with $B = -1$:


Negative damping means growth    $y'' - y' + y = 0$    $s^2 - s + 1 = 0$.


That changes $a$ to $+\frac{1}{2}$. The roots $a \pm i\omega$ are now coming from $s^2 - s + 1 = 0$:


$s_1 = a + i\omega = \frac{1}{2} + \frac{\sqrt{3}}{2} i$ has magnitude $|s_1| = \sqrt{a^2 + \omega^2} = 1$.

This point $s_1$ is on the unit circle, because $|s_1| = 1$. Its real part $a$ is $+\frac{1}{2}$, so $s_1$ is on
the right side (not left side) of the imaginary axis. The angle in $s_1 = e^{i\theta}$ changes to $\theta = 60°$.
Now $s_1$ and $s_2$ are on the right half of the unit circle (the unstable half: $e^{st}$ grows).


"Anti-damping" $B = -1$    Growth rate $a = \frac{1}{2}$    Magnitude $|e^{st}| = e^{t/2}$


In most physical problems we expect positive damping $B > 0$ and negative growth rate
$a < 0$. Then the differential equation is stable and its null solutions die out as $t \to \infty$.


Overdamping versus Underdamping 


This section emphasizes the difference between B? > 4AC and B? < 4AC. That is the 
difference between real roots and complex roots. This is a difference you can see—with your 
own eyes and not just with formulas. For damping coefficients B = 1,2,3 
the solutions to y” + By’ + y = 0 will approach zero in different ways (Figure 2.9). 

At this time I want to vary the damping $B$ instead of the stiffness $C$.

[Figure 2.9 graph: solutions $y(t)$ starting from $y(0)$; the surviving curve labels are "overdamped" and "critically" (damped).]

Figure 2.9: $y(t)$ goes directly to zero (overdamped) or it oscillates (underdamped).


The four damping possibilities match the four possibilities for roots of $As^2 + Bs + C = 0$.
This table brings the whole section together:


Overdamping         $B^2 > 4AC$    Real roots         $e^{s_1 t}$ and $e^{s_2 t}$
Critical damping    $B^2 = 4AC$    Double root        $e^{s_1 t}$ and $te^{s_1 t}$
Underdamping        $B^2 < 4AC$    Complex roots      $e^{at}\cos\omega t$ and $e^{at}\sin\omega t$
No damping          $B = 0$        Imaginary roots    $\cos\omega t$ and $\sin\omega t$


Figure 2.9 shows how the graph crosses zero and comes back, for underdamping.
This is like a child's swing that is settling to zero (so the child can get off the swing).
When $B = 0$ we have $a = 0$ and imaginary roots $\pm i\omega$ and pure spring-mass oscillation.

Figure 2.10 shows four parabolas all with $A = C = 1$. The damping coefficients are
$B = 0, 1, 2, 3$. When $B = 3$ the damping is strong and $s^2 + 3s + 1 = 0$ has real roots.
When $B = 2$ the damping is critical and $s^2 + 2s + 1 = 0$ has a double root $s = -1, -1$.
When $B = 1$ the damping is weak and the roots are complex. The solutions $y = e^{at}\cos\omega t$
and $y = e^{at}\sin\omega t$ oscillate as the $e^{at}$ term goes to zero. When $B = 0$ there is no decay.


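[Figure 2.10 graph: four parabolas $y = s^2 + Bs + 1$. For $B = 0$: $y = s^2 + 0s + 1$, $s = i, -i$. For $B = 1$ (underdamped, $B < 2$): $y = s^2 + 1s + 1$, $s = (-1 \pm \sqrt{3}\,i)/2$. For $B = 2$: $y = s^2 + 2s + 1$, $s = -1, -1$. For $B = 3$ (overdamped): $y = s^2 + 3s + 1$, $s = (-3 \pm \sqrt{5})/2$.]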


Figure 2.10: As $B$ increases, the lowest point on the parabola moves left and down.


Fundamental Solution = Growth Factor = Impulse Response


One special choice of initial conditions is all-important: $g(0) = 0$ and $g'(0) = 1/A$.
The letter $g$ instead of $y$ picks out this fundamental solution. This is a null solution
with the jump start $g'(0)$. It is also a particular solution to $Ag'' + Bg' + Cg = \delta(t)$.
This fundamental solution from the delta function will lead us to all solutions.

Review: The roots of $As^2 + Bs + C = 0$ are $s_1$ and $s_2$. They give two solutions $e^{s_1 t}$
and $e^{s_2 t}$ to the null equation, if $s_1 \ne s_2$. We want the combination $g = c_1 e^{s_1 t} + c_2 e^{s_2 t}$ that
matches $g(0) = 0$ and $g'(0) = 1/A$. Choose the right $c_1$ and $c_2$:


$g(0) = c_1 + c_2 = 0$                Multiply by $s_2$    $s_2 c_1 + s_2 c_2 = 0$
$g'(0) = s_1 c_1 + s_2 c_2 = 1/A$    Then subtract        $(s_1 - s_2)c_1 = 1/A$

The fundamental solution $g(t) = \frac{e^{s_1 t} - e^{s_2 t}}{A(s_1 - s_2)}$ has $c_1 = \frac{1}{A(s_1 - s_2)} = -c_2$ (12)


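No damping For the oscillation equation $my'' + ky = 0$, the roots of $ms^2 + k = 0$ are
imaginary: $s_1 = i\sqrt{k/m} = i\omega$ and $s_2 = -i\sqrt{k/m} = -i\omega$. Then the fundamental solution
has a simple form with $A = m$:


$g(t) = \frac{e^{s_1 t} - e^{s_2 t}}{A(s_1 - s_2)} = \frac{e^{i\omega t} - e^{-i\omega t}}{m(2i\omega)} = \frac{2i\sin\omega t}{2im\omega} = \frac{\sin\omega t}{A\omega}$. (13)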


This is exactly the impulse response from Section 2.1. Clearly $g(0) = 0$ and $g'(0) = 1/A$.

Underdamping Now $s_1 = a + i\omega$ and $s_2 = a - i\omega$. There is decay from $a = -B/2A$
and oscillation from $\omega$. Soon we will write $p$ for $B/2A$ and $\omega_d$ for $\omega$.

$g(t) = \frac{e^{(a+i\omega)t} - e^{(a-i\omega)t}}{A(2i\omega)} = e^{at}\,\frac{\sin\omega t}{A\omega} = e^{-pt}\,\frac{\sin\omega_d t}{A\omega_d}$. (14)
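Formula (12) can be evaluated directly with complex arithmetic and compared with the real forms (13) and (14). The sketch below assumes distinct roots ($B^2 \ne 4AC$); the function name and the sample coefficients are illustrative assumptions, not from the text:

```python
import cmath
import math

def fundamental_solution(A, B, C, t):
    """g(t) from formula (12); assumes distinct roots (B^2 != 4AC)."""
    root = cmath.sqrt(B * B - 4 * A * C)
    s1 = (-B + root) / (2 * A)
    s2 = (-B - root) / (2 * A)
    g = (cmath.exp(s1 * t) - cmath.exp(s2 * t)) / (A * (s1 - s2))
    return g.real      # imaginary part is zero up to roundoff for real A, B, C

# No damping: 2y'' + 8y = 0 has omega = 2, so (13) predicts sin(2t)/(2*2).
print(fundamental_solution(2, 0, 8, 0.3), math.sin(0.6) / 4)

# Underdamping: y'' + y' + y = 0 has a = -1/2 and omega = sqrt(3)/2, as in (14).
w = math.sqrt(3) / 2
print(fundamental_solution(1, 1, 1, 1.0), math.exp(-0.5) * math.sin(w) / w)
```

The division by $s_1 - s_2$ is why this sketch excludes the equal-root case; a separate formula is needed there.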


Critical damping Now $B^2 = 4AC$ and the roots are equal: $s_1 = s_2 = -B/2A$.
The second solution to the differential equation (after $e^{s_1 t}$) is $te^{s_1 t}$. Dividing by $A$,
this is exactly the solution that has $g(0) = 0$ and $g'(0) = 1/A$.


$g(t) = \frac{te^{s_1 t}}{A} = \frac{te^{-Bt/2A}}{A}$. (15)

Overdamping When $B^2 > 4AC$, the roots $s_1$ and $s_2$ are real. Formula (12) is best.


The real purpose of $g(t)$ is to solve $Ay'' + By' + Cy = f(t)$ with any right side $f(t)$.
This impulse response $g$ is the fundamental solution that gives all other solutions:


Solution for any $f(t)$    $y_p(t) = \int_0^t g(t - s) f(s)\,ds$ (16)


The step response to $f(t) = 1$ is $y_p$ = integral of $g(t)$. This comes in Section 2.5.


Delta Function and Impulse Response


In this section $g(t)$ is a null solution with initial velocity $g'(0) = 1/A$. The same $g(t)$ is a
particular solution in the next section, with initial velocity zero but driven by an impulse
$f(t) = \delta(t)$. Only a delta function could make this possible: $g(t)$ is $y_n$ for one problem
and $y_p$ for another problem.

The informal explanation is to integrate all terms in $Ag'' + Bg' + Cg = \delta(t)$.
On the right side the integral is 1. The integration is over a very short interval 0 to $\Delta$.
On the left side the integral of $Ag''$ is $Ag'(\Delta)$, plus terms of order $\Delta$ going to 0.
To match 1 on the right side, the impulse response $g(t)$ starts immediately with $g' = 1/A$.


Example 5 The best example is $g''(t) = \delta(t)$ with ramp function $g(t) = t$.


The derivative of the ramp is a step function. You see the sudden jump to $g' = 1$.

The ramp $g(t) = t$ agrees with formula (15) in this case with $A = 1$ and $B = C = 0$.

The null equation $g'' = 0$ starting from $g(0) = 0$ and $g'(0) = 1$ is solved by $g(t) = t$.
Everything is zero for $t < 0$. Then we see the ramp $g(t)$ and the step $g'(t)$ and $g'' = \delta(t)$.
This is the limiting case of equation (12) when $B$ and $C$ and $s_1$ and $s_2$ approach zero.


A personal note Thank you for accepting the slightly illegal input $\delta(t)$ and its response $g(t)$.
I could have left those out of the book. But I couldn’t have lived with myself. They are truly 
the key to theory and applications. 


Shift Invariance from Constant Coefficients 


For a constant coefficient equation, the growth from time s to time t is exactly equal to 
the growth from 0 to t — s. The problem is shift invariant. We can start the time interval 
anywhere we want. For all intervals of the same length, we will see the same growth 
factor g(t — s). This is the growth of input 


Inputs $f(s)$ at times $s$    Total output $y(t) = \int_0^t g(t - s) f(s)\,ds$. (17)


This is exactly like the main formula $y(t) = \int e^{a(t-s)} q(s)\,ds$ in Chapter 1. There the
growth factor was $g(t) = e^{at}$. The equation $dy/dt - ay = q(t)$ had constant $a$.
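The integral of $g(t - s)f(s)$ is easy to approximate numerically. The sketch below uses the trapezoidal rule, which is an assumption of this example rather than a method from the text, and the helper name `convolve_response` is hypothetical. It checks two cases whose answers are known: $g(t) = t$ with $f = 1$ should give $t^2/2$, and $g(t) = \sin t$ with $f = 1$ should give $1 - \cos t$:

```python
import math

def convolve_response(g, f, t, n=2000):
    """Trapezoidal estimate of the integral of g(t - s) f(s) ds from s = 0 to s = t."""
    h = t / n
    total = 0.5 * (g(t) * f(0.0) + g(0.0) * f(t))   # endpoints s = 0 and s = t
    for k in range(1, n):
        s = k * h
        total += g(t - s) * f(s)
    return total * h

one = lambda s: 1.0

# y'' = 1 with g(t) = t: expect t^2/2.
print(convolve_response(lambda t: t, one, 2.0), 2.0 ** 2 / 2)

# y'' + y = 1 with g(t) = sin t: expect 1 - cos t.
print(convolve_response(math.sin, one, 2.0), 1 - math.cos(2.0))
```

Only the elapsed time $t - s$ enters $g$, which is the shift invariance described above.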


Shift invariance is lost if any of the coefficients $A, B, C$ change with time. The growth
factor becomes $g(s,t)$, depending on the specific start $s$ and end $t$ (not just on the
elapsed time $t - s$). In this harder case the solution is $y(t) = \int g(s,t) f(s)\,ds$.

For a first order equation, Section 1.6 found $g(s,t)$. But second order equations
with time-varying coefficients are usually impossible to solve with familiar functions.
We often have no formula for $g(s,t)$, the response at time $t$ to an impulse at time $s$.
Shift invariance (constant coefficients) is the key to successful solution formulas.


Better Formulas for $s_1$ and $s_2$


The solutions to $As^2 + Bs + C = 0$ are $s_1$ and $s_2$. The formula for those two roots
involves $B^2 - 4AC$. We have seen that $B^2 > 4AC$ is very different from $B^2 < 4AC$.
Overdamping leads to real roots, underdamping leads to complex roots and oscillations.
The formulas are so important that the whole world of science and engineering has tried
to make them simpler.

Here is the natural way to start. Assign letters to the ratios $B/2A$ and $C/A$. We know
$C/A$ as $\omega_n^2$. This is $k/m$ in mechanics. It gives the "natural frequency" with no damping.
For the ratio $B/2A$ I will use the letter $p$. The main point is to simplify $s_1$ and $s_2$:


$s_1, s_2 = \frac{-B \pm \sqrt{B^2 - 4AC}}{2A} = -p \pm \sqrt{p^2 - \omega_n^2}$ (18)


A big improvement! Two symbols instead of three, which makes sense because we can
divide $As^2 + Bs + C = 0$ by $A$. By introducing $p = B/2A$ we remove the 2 and the 4
in equation (18).

The comparison of $B^2$ to $4AC$ is now the comparison of $p^2$ to $\omega_n^2$. When $p^2 > \omega_n^2$,
the roots are real (overdamping). When $p^2 - \omega_n^2$ is negative, $s_1$ and $s_2$ will be complex.


We have oscillation at a damped frequency $\omega_d$, lower than the natural frequency $\omega_n$:

$\omega_d^2 = \omega_n^2 - p^2$    $s_1$ and $s_2 = -p \pm i\sqrt{\omega_n^2 - p^2} = -p \pm i\omega_d$ (19)


The Damping Ratio Z 


The presentation could stop there. We see that the ratio of $p$ to $\omega_n$ is highly important. This
fact suggests one final step, that we take now: $Z = p/\omega_n$ is the damping ratio $Z$. In
engineering this ratio is called zeta (the Greek letter is $\zeta$). To make it easier to write, allow
me to use $Z$ (capital zeta in Greek = capital Z in Roman). Then we can replace $p$ by $Z\omega_n$.
Now the formula $s = -p \pm i\omega_d$ uses $\omega_n$ and $Z$:


$s_1, s_2 = -Z\omega_n \pm i\omega_d = -Z\omega_n \pm i\omega_n\sqrt{1 - Z^2}$ (20)


The damped $\omega_d^2$ is $\omega_n^2 - p^2 = \omega_n^2(1 - Z^2)$. Its square root $\omega_d$ is the damped frequency. The
null solutions are $y_n(t) = e^{-Z\omega_n t}(c_1\cos\omega_d t + c_2\sin\omega_d t)$.
Underdamping is $Z < 1$, critical damping is $Z = 1$, and overdamping is $Z > 1$.
The key points become clear because this ratio $Z$ is dimensionless:

Damping ratio $Z = \frac{p}{\omega_n} = \frac{B/2A}{\sqrt{C/A}} = \frac{B}{\sqrt{4AC}} = \frac{b}{\sqrt{4mk}}$. (21)

If time is measured in minutes instead of seconds, the numbers $A, B, C$ are changed by
$60^2$ and 60 and 1. The ratio of $B$ to $\sqrt{4AC}$ is not changed: a factor of 60 for both.
This confirms that $B^2 - 4AC$ is a suitable quantity to appear in the quadratic formula,
because $B^2$ and $4AC$ have the same units.

One last point is a good approximation when $Z$ is small. The square root of $1 - Z^2$ is
close to $1 - \frac{1}{2}Z^2$. This comes from calculus (linear approximation using the tangent line).
The good way to confirm it is to square both sides. Then $Z^4/4$ is very small.


$\sqrt{1 - Z^2} \approx 1 - \frac{1}{2}Z^2$ becomes $1 - Z^2 \approx 1 - Z^2 + \frac{1}{4}Z^4$. (22)


The good measure of damping is the ratio $Z = B/\sqrt{4AC}$. This key dimensionless
number decides everything:


$Z > 1$    $B^2 > 4AC$ and real roots: Overdamping and no oscillation.
$Z < 1$    $B^2 < 4AC$ and complex roots: Underdamping and slow oscillation.
$Z = 1$    $B^2 = 4AC$ and a double root $-B/2A$: critical damping.
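A short sketch of these quantities, assuming $A > 0$ and $C > 0$ (the function names are assumptions chosen for this example). It computes $Z$, classifies the damping with the $B^2$ versus $4AC$ test, and checks that $Z$ is unchanged when time is measured in minutes, so that $A, B, C$ are multiplied by $60^2$, 60, 1:

```python
import math

def damping_ratio(A, B, C):
    """Z = B / sqrt(4AC); assumes A > 0 and C > 0."""
    return B / math.sqrt(4 * A * C)

def classify_damping(A, B, C):
    """Same decision as comparing Z with 1 (for B >= 0), done exactly on B^2 - 4AC."""
    disc = B * B - 4 * A * C
    if disc > 0:
        return "overdamping"
    if disc == 0:
        return "critical damping"
    return "underdamping"

def damped_frequency(A, B, C):
    """omega_d = omega_n * sqrt(1 - Z^2), defined for Z < 1."""
    Z = damping_ratio(A, B, C)
    return math.sqrt(C / A) * math.sqrt(1 - Z * Z)

for B in (1, 2, 3):
    print(B, damping_ratio(1, B, 1), classify_damping(1, B, 1))

# Change of time units: A, B, C scale by 60^2, 60, 1 and Z stays the same.
print(damping_ratio(2, 3, 5), damping_ratio(3600 * 2, 60 * 3, 5))
```

The classification uses $B^2 - 4AC$ rather than the floating point value of $Z$, so the critical case is detected exactly for integer inputs.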


Here is a curious fact. For very large $B$, the roots are approximately $s_1 = -1/B$ and
$s_2 = -B$ (taking $A = C = 1$, as in Figure 2.10). That root $s_2$ gives fast decay. But the actual
decay of $y(t)$ is controlled by $s_1$, which approaches zero! So increasing $B$ actually slows down
this dominant decay mode.
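The curious fact can be seen in numbers. This sketch takes $A = C = 1$ (an assumption matching Figure 2.10) and a hypothetical helper `overdamped_roots`. It computes the large root from the quadratic formula and the small root from the product $s_1 s_2 = C/A$, which avoids subtracting two nearly equal numbers:

```python
import math

def overdamped_roots(A, B, C):
    """Real roots of A s^2 + B s + C = 0 when B^2 > 4AC and B > 0."""
    s2 = (-B - math.sqrt(B * B - 4 * A * C)) / (2 * A)   # large (fast) root
    s1 = C / (A * s2)                                     # small root, from s1 * s2 = C/A
    return s1, s2

for B in (10.0, 100.0, 1000.0):
    s1, s2 = overdamped_roots(1.0, B, 1.0)
    print(B, s1, -1.0 / B, s2)
```

As $B$ grows, the printed $s_1$ moves toward zero while $s_2$ moves toward $-B$.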


Note that many authors refer to $s_1$ and $s_2$ as poles. They are poles of the transfer function
$Y(s) = 1/(As^2 + Bs + C)$, where $Y$ becomes 1/0. We will come back to
transfer functions! Some authors emphasize time constants rather than exponents. The
exponential $e^{-pt}$ has time constant $\tau = 1/p$. In that time $\tau$, $e^{-pt}$ decays by a factor $e$.


REVIEW OF THE KEY IDEAS


1. The equation $Ay'' + By' + Cy = 0$ is solved by $y = e^{st}$ when $As^2 + Bs + C = 0$.

2. The roots $s_1, s_2$ are real if $B^2 > 4AC$, equal if $B^2 = 4AC$, complex if $B^2 < 4AC$.

3. Negative real roots give stability and overdamping: $y(t) = c_1 e^{s_1 t} + c_2 e^{s_2 t} \to 0$.

4. Equal roots $s = -B/2A$ when $B^2 = 4AC$. Change the second solution to $y_2 = te^{st}$.

5. Complex roots $a \pm i\omega$ give underdamped oscillations: $e^{at}(C_1\cos\omega t + C_2\sin\omega t)$.

6. The initial values $g(0) = 0$ and $g'(0) = 1/A$ give $g(t) = (e^{s_1 t} - e^{s_2 t})/A(s_1 - s_2)$.
The same $g(t)$ solves $Ag'' + Bg' + Cg = \delta(t)$. This is the fundamental solution.

7. $s_1$ and $s_2$ become $-p \pm i\omega_d$ with $p = B/2A$ and $\omega_d^2 = \omega_n^2 - p^2$. With damping ratio
$Z = B/\sqrt{4AC} < 1$, those complex $s_1$ and $s_2$ are $-Z\omega_n \pm i\omega_n\sqrt{1 - Z^2}$.


Problem Set 2.3


1 Substitute $y = e^{st}$ and solve the characteristic equation for $s$:

(a) $2y'' + 8y' + 6y = 0$    (b) $y'''' - 2y'' + y = 0$.

2 Substitute $y = e^{st}$ and solve the characteristic equation for $s = a + i\omega$:

(a) $y'' + 2y' + 5y = 0$    (b) $y'''' + 2y'' + y = 0$
3. Which second order equation is solved by y = cye~** + coe"? Or y = te’? 
4 Which second order equation has solutions $y = c_1 e^{-2t}\cos 3t + c_2 e^{-2t}\sin 3t$?
5 Which numbers $B$ give (under) (critical) (over) damping in $4y'' + By' + 16y = 0$?
6 If you want oscillation from $my'' + by' + ky = 0$, then $b$ must stay below ____.


Problems 7-16 are about the equation $As^2 + Bs + C = 0$ and the roots $s_1, s_2$.


7 The roots $s_1$ and $s_2$ satisfy $s_1 + s_2 = -2p = -B/A$ and $s_1 s_2 = \omega_n^2 = C/A$. Show
this two ways:


(a) Start from $As^2 + Bs + C = A(s - s_1)(s - s_2)$. Multiply to see $s_1 s_2$ and $s_1 + s_2$.


(b) Start from $s_1 = -p + i\omega_d$, $s_2 = -p - i\omega_d$.


8 Find $s$ and $y$ at the bottom point of the graph of $y = As^2 + Bs + C$. At that minimum
point $s = s_{\min}$ and $y = y_{\min}$, the slope is $dy/ds = 0$.


9 The parabolas in Figure 2.10 show how the graph of $y = As^2 + Bs + C$ is raised
by increasing $B$. Using Problem 8, show that the bottom point of the graph moves left
(change in $s_{\min}$) and down (change in $y_{\min}$) when $B$ is increased by $\Delta B$.


10 (recommended) Draw a picture to show the paths of $s_1$ and $s_2$ when $s^2 + Bs + 1 = 0$
and the damping increases from $B = 0$ to $B = \infty$. At $B = 0$, the roots are on the
____ axis. As $B$ increases, the roots travel on a circle (why?). At $B = 2$, the
roots meet on the real axis. For $B > 2$ the roots separate to approach 0 and $-\infty$.
Why is their product $s_1 s_2$ always equal to 1?


11 (this too if possible) Draw the paths of $s_1$ and $s_2$ when $s^2 + 2s + k = 0$ and the
stiffness increases from $k = 0$ to $k = \infty$. When $k = 0$, the roots are ____.
At $k = 1$, the roots meet at $s =$ ____. For $k \to \infty$ the two roots travel up/down
on a ____ in the complex plane. Why is their sum $s_1 + s_2$ always equal to $-2$?


12 If a polynomial $P(s)$ has a double root at $s = s_1$, then $(s - s_1)$ is a double factor and
$P(s) = (s - s_1)^2 Q(s)$. Certainly $P = 0$ at $s = s_1$. Show that also $dP/ds = 0$
at $s = s_1$. Use the product rule to find $dP/ds$.


13 Show that $y'' = 2ay' - (a^2 + \omega^2)y$ leads to $s = a \pm i\omega$. Solve $y'' - 2y' + 10y = 0$.


14 The undamped natural frequency is $\omega_n = \sqrt{k/m}$. The two roots of $ms^2 + k = 0$ are
$s = \pm i\omega_n$ (pure imaginary). With $p = b/2m$, the roots of $ms^2 + bs + k = 0$ are
$s_1, s_2 = -p \pm \sqrt{p^2 - \omega_n^2}$. The coefficient $p = b/2m$ has the units of 1/time.


15 Solve $s^2 + 0.1s + 1 = 0$ and $s^2 + 10s + 1 = 0$ with numbers correct to two decimals.


16 With large overdamping $p \gg \omega_n$, the square root $\sqrt{p^2 - \omega_n^2}$ is close to
$p - \omega_n^2/2p$. Show that the roots of $ms^2 + bs + k$ are $s_1 \approx -\omega_n^2/2p$ (small)
and $s_2 \approx -2p = -b/m$ (large).

With small underdamping $p \ll \omega_n$, the square root of $p^2 - \omega_n^2$ is approximately
$i\omega_n - ip^2/2\omega_n$. Square that to come close to $p^2 - \omega_n^2$. Then the frequency for small
underdamping is reduced to $\omega_d \approx \omega_n - p^2/2\omega_n$.


17 Here is an 8th order equation with eight choices for solutions $y = e^{st}$:

$\frac{d^8 y}{dt^8} = y$ becomes $s^8 e^{st} = e^{st}$ and $s^8 = 1$: Eight roots in Figure 2.6.

Find two solutions $e^{st}$ that don't oscillate ($s$ is real). Find two solutions that only
oscillate ($s$ is imaginary). Find two that spiral in to zero and two that spiral out.


18 $A_n\frac{d^n y}{dt^n} + \cdots + A_1\frac{dy}{dt} + A_0 y = 0$ leads to $A_n s^n + \cdots + A_1 s + A_0 = 0$.
The $n$ roots $s_1, \ldots, s_n$ produce $n$ solutions $y(t) = e^{st}$ (if those roots are distinct).

Write down $n$ equations for the constants $c_1$ to $c_n$ in $y = c_1 e^{s_1 t} + \cdots + c_n e^{s_n t}$ by
matching the $n$ initial conditions for $y(0), y'(0), \ldots, D^{n-1}y(0)$.


19 Find two solutions to $d^{2015}y/dt^{2015} = dy/dt$. Describe all solutions to $s^{2015} = s$.


20 The solution to $y'' = 1$ starting from $y(0) = y'(0) = 0$ is $y(t) = t^2/2$. The
fundamental solution to $g'' = \delta(t)$ is $g(t) = t$ by Example 5. Does the integral
$\int g(t - s) f(s)\,ds = \int (t - s)\,ds$ from 0 to $t$ give the correct solution $y = t^2/2$?


21 The solution to $y'' + y = 1$ starting from $y(0) = y'(0) = 0$ is $y = 1 - \cos t$. The
solution to $g'' + g = \delta(t)$ is $g(t) = \sin t$ by equation (13) with $\omega = 1$ and $A = 1$.
Show that $1 - \cos t$ agrees with the integral $\int g(t - s) f(s)\,ds = \int \sin(t - s)\,ds$.


The step function H(t) = 1 for t > 0 is the integral of the delta function. So the step 
response r(t) is the integral of the impulse response. This fact must also come 
from our basic solution formula: 


t 
Ar" + Br'+Cr=1 with r(0)=r'(0)=0 has r(t) = ic —s)1ds 
0 


Change t — s to 7 and change ds to —dr to confirm that r(t) = [ g(7)dr. 
0 


Section 2.5 will find two good formulas for the step response r(t). 
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2.4 Forced Oscillations and Exponential Response 


The equation Ay” + By’+Cy = Ohas no forcing term. Its right side is zero. This equation is 
homogeneous. The null solution yn(t) = ce®!® + cye82 is controlled by 
the initial conditions y(0) and y’(0). If those are zero, the system never moves. 
The equation Ay” + By’ + Cy = f(t) is forced or driven by that new term f(t). 
Previously y = 0 was a possible solution. Now we can expect a particular solution yp. 
This section is about driving forces f = e%* and et and coswt and sinwt. For 
f = e%, the next example will show you how to find Yp- 


Exponential Driving Force 


In this example, one particular solution $y_p(t) = Ye^{st}$ is a multiple of the input $e^{st}$.
All we have to do is find that number $Y$, by substituting into the differential equation.


Example 1 Solve $y'' + 5y' + 6y = e^{4t}$. One particular solution will be $y_p = Ye^{4t}$.


When $Ye^{4t}$ is substituted into the equation, all terms contain $e^{4t}$:

$$y'' + 5y' + 6y = 16Ye^{4t} + 20Ye^{4t} + 6Ye^{4t} = e^{4t}. \tag{1}$$

The left side is $42Ye^{4t}$. This matches the right side $e^{4t}$ when $Y = 1/42$:

$$\text{Particular } y_p \qquad 42Ye^{4t} = e^{4t} \quad\text{gives}\quad Y = \frac{1}{42} \qquad y_p(t) = \frac{e^{4t}}{42} \tag{2}$$


The complete solution has the form $y = y_p + y_n$. There are two arbitrary constants
$c_1$ and $c_2$ in the solution $y_n(t)$ to the homogeneous equation (the null equation with
forcing term = zero). Look for the two exponents $s_1$ and $s_2$ that solve the quadratic
equation $As^2 + Bs + C = 0$. We know how to find the null solution $y_n$.


Substitute $y = e^{st}$ into $y'' + 5y' + 6y = 0$. Cancel $e^{st}$ to find $s^2 + 5s + 6 = 0$.


That quadratic factors into $(s + 2)(s + 3)$. This is zero for $s = -2$ and $s = -3$.
Those roots of the “characteristic equation” are the exponents in the null solution $y_n(t)$.
This is the homogeneous solution = complementary solution = transient solution,
which decays to zero at $t = \infty$ when there is damping.


Null solution $y_n(t) = c_1e^{-2t} + c_2e^{-3t}$.
The final step is to choose $c_1$ and $c_2$ so that $y = y_p + y_n = \frac{1}{42}e^{4t} + y_n$ satisfies the initial
conditions. This will complete Example 1, by getting it right at $t = 0$.

Initial position $y(0) = \dfrac{1}{42} + c_1 + c_2$

Initial velocity $y'(0) = \dfrac{4}{42} - 2c_1 - 3c_2$


Those two equations tell us the correct values $c_1$ and $c_2$, when $y(0)$ and $y'(0)$ are given.
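
The two initial-condition equations of Example 1 form a 2 by 2 linear system for the constants. As a minimal sketch (the function names `example1_constants`, `y` and `residual` are illustrative choices, not from the text), the Python below solves that system for given $y(0)$ and $y'(0)$ and then checks that the complete solution satisfies $y'' + 5y' + 6y = e^{4t}$.

```python
import math

def example1_constants(y0, v0):
    """Solve c1 + c2 = y0 - 1/42 and -2*c1 - 3*c2 = v0 - 4/42 for (c1, c2)."""
    r1 = y0 - 1.0 / 42.0   # part of y(0) left for the null solution
    r2 = v0 - 4.0 / 42.0   # part of y'(0) left for the null solution
    c1 = 3.0 * r1 + r2     # 3*(first equation) + (second equation) eliminates c2
    c2 = -2.0 * r1 - r2    # then c2 = r1 - c1
    return c1, c2

def y(t, c1, c2):
    """Complete solution y = y_p + y_n for Example 1."""
    return math.exp(4 * t) / 42 + c1 * math.exp(-2 * t) + c2 * math.exp(-3 * t)

def residual(t, c1, c2):
    """y'' + 5y' + 6y - e^{4t}, using the exact derivative of each exponential."""
    e4, e2, e3 = math.exp(4 * t), math.exp(-2 * t), math.exp(-3 * t)
    y0 = e4 / 42 + c1 * e2 + c2 * e3
    y1 = 4 * e4 / 42 - 2 * c1 * e2 - 3 * c2 * e3
    y2 = 16 * e4 / 42 + 4 * c1 * e2 + 9 * c2 * e3
    return y2 + 5 * y1 + 6 * y0 - e4

c1, c2 = example1_constants(0.0, 0.0)   # system starting from rest
print(c1, c2, residual(0.5, c1, c2))
```

Starting from rest, $y(0) = y'(0) = 0$, the constants are $c_1 = -1/6$ and $c_2 = 1/7$, and the residual is zero up to roundoff.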




Exponential Response Formula 


We can turn that example into a formula for $Y$ that almost always succeeds. Put $y = Ye^{st}$
into the equation. Each derivative multiplies $y$ by $s$. So $Ay'' + By' + Cy$ will multiply
$y = Ye^{st}$ by the number $As^2 + Bs + C$. Divide by that number to see $Y$:

$$Ay'' + By' + Cy = e^{st} \quad\text{is solved by}\quad y = Ye^{st} = \frac{1}{As^2 + Bs + C}\,e^{st} \tag{3}$$

That fraction $Y$ is called the transfer function. It ‘transfers’ the exponential input $e^{st}$
into the exponential output $y_p = Ye^{st}$. The formula allows $s$ to be an imaginary $i\omega$
or any complex number $s = a + i\omega$. Use the exponent $s$ that is in the driving force $f$:

$$Ay'' + By' + Cy = e^{i\omega t} \quad\text{leads to}\quad y_p(t) = \frac{1}{A(i\omega)^2 + B(i\omega) + C}\,e^{i\omega t} \tag{4}$$
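
Formulas (3) and (4) are the same computation with a real or a complex exponent. A small sketch, assuming a helper named `transfer` (a hypothetical name) that raises an error when the denominator is zero:

```python
def transfer(A, B, C, s):
    """Transfer function Y(s) = 1 / (A s^2 + B s + C); s may be real or complex."""
    p = A * s * s + B * s + C
    if p == 0:
        # The exponent s is a root of A s^2 + B s + C: the formula contains 1/0.
        raise ZeroDivisionError("resonance: s is a root of A s^2 + B s + C")
    return 1 / p

print(transfer(1, 5, 6, 4))     # Example 1: y'' + 5y' + 6y = e^{4t}, so Y = 1/42
print(transfer(1, 1, 0, 1j))    # imaginary exponent s = i with A = B = 1, C = 0
```

Python's built-in complex type handles $s = i\omega$ or $s = a + i\omega$ without any change to the function.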


Example 2 $y'' + y' = e^{it}$ has $s = i\omega = i$. Substitute $y = Ye^{it}$ and solve for $Y$:

$$i^2Ye^{it} + iYe^{it} = e^{it} \qquad (i^2 + i)Y = 1 \qquad y_p(t) = \frac{1}{i - 1}\,e^{it}. \tag{5}$$

Example 3 (important) Solve $y'' + y' = \cos t$. The cosine is the real part of $e^{it}$.
Warning: The solution will not have the form $y = Y\cos t$. The derivative $-Y\sin t$
would appear in the differential equation, with no other term to cancel it. The correct
solution involves both $\cos t$ and $\sin t$. Damping from $y'$ delays the cosine.


Here $y_p(t)$ in Example 3 is the real part of $y_p(t)$ in Example 2. Please use this idea:
The real part of the input $e^{i\omega t}$ produces the real part of the output $Ye^{i\omega t}$.


Step 1 Write $Y$ (in 5) $= \dfrac{1}{i - 1} = \dfrac{-1 - i}{2}$.

Step 2 The real part of $Ye^{it} = \dfrac{-1 - i}{2}(\cos t + i\sin t)$ is $y_p = \dfrac{1}{2}(-\cos t + \sin t)$.
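
Steps 1 and 2 can be checked numerically. This sketch (function names are illustrative assumptions) builds $Y$ from equation (5), takes the real part of $Ye^{it}$, and compares it with the closed form $\frac{1}{2}(-\cos t + \sin t)$. The residual uses the fact that each derivative of $Ye^{it}$ multiplies it by $i$.

```python
import cmath
import math

Y = 1 / (1j**2 + 1j)   # transfer function for y'' + y' at s = i, as in equation (5)

def yp_from_complex(t):
    """Real part of the complex response Y e^{it}."""
    return (Y * cmath.exp(1j * t)).real

def yp_closed_form(t):
    """Step 2 of Example 3: (1/2)(-cos t + sin t)."""
    return 0.5 * (-math.cos(t) + math.sin(t))

def residual(t):
    """Real part of (i^2 + i) Y e^{it}, minus cos t: this is y'' + y' - cos t."""
    z = Y * cmath.exp(1j * t)
    return (1j**2 * z + 1j * z).real - math.cos(t)

for t in (0.0, 0.7, 2.0):
    print(t, yp_from_complex(t), yp_closed_form(t), residual(t))
```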


The exponential response formulas are (3) and (4). The only time they fail is when the
denominator in the fraction is zero. The formula would then contain 1/0. That happens
when the exponent $s$ in the driving term equals one of the exponents $s_1$ and $s_2$ in the null
solution $y_n = c_1e^{s_1t} + c_2e^{s_2t}$. This is called resonance: $s = s_1$ or $s = s_2$.

You see that we cannot allow $y_p$ to be included among the null solutions $y_n$. If the right
side is $f \neq 0$ for $y_p$, it cannot also be $f = 0$ as required by $y_n$. We will see that the correct
form for a resonant solution $y_p$ includes an extra factor $t$ in $Yte^{st}$.

A special effort goes into the oscillating case $s = i\omega$. Null solutions $y_n = e^{st}$
depend only on $A, B, C$. That part comes from the roots of $As^2 + Bs + C = 0$. The new part
is the forced oscillation $y_p(t)$, a particular solution that is driven by $\cos\omega t$.
It will be $y_p(t) = G\cos(\omega t - \alpha)$ with a phase shift $\alpha$ and a gain $G$ in the amplitude.

Equations of Order N and Order 2


I would like to outline the work ahead, because this section is important. It started with a
specific example $y'' + 5y' + 6y = e^{4t}$. Those numbers 1, 5, 6, 4 changed to letters $A, B, C, s$.
We solved the second order equation $Ay'' + By' + Cy = e^{st}$. The solution $Ye^{st}$ introduced
the transfer function $Y = 1/(As^2 + Bs + C)$.

Now we have two ways to go, both essential. One is to see the same formula $y = Ye^{st}$
for every constant coefficient equation. $Y$ comes from the “exponential response formula”
because $Ye^{st}$ is the response to the exponential $f(t) = e^{st}$. One formula covers almost all
equations (but resonance is special and $Y$ has to change).

The other crucial step is to focus on second order equations driven by $f = e^{i\omega t}$.
Yes, this is covered by the formula. But if we are serious, we won’t stop with $Y(i\omega)$.
We truly need the rectangular and polar forms of that complex number:

$$Y(i\omega) = \frac{1}{A(i\omega)^2 + B(i\omega) + C} = M - iN = Ge^{-i\alpha}. \tag{6}$$

$M, N, G, \alpha$ will be in equations (23) to (27). The solution driven by $f = \cos\omega t$ becomes
$y = M\cos\omega t + N\sin\omega t$. Damped motion ($B > 0$) can be compared with undamped.
And the big applications in Section 2.5 need the better notation using $Z$:

$$\text{Natural frequency } \omega_n^2 = \frac{C}{A} \qquad \text{Damping ratio } Z = \frac{B}{\sqrt{4AC}} \qquad \text{Damped frequency } \omega_d^2 = \omega_n^2(1 - Z^2) \tag{7}$$

The damping ratio $Z$ and those frequencies $\omega_n$ and $\omega_d$ give meaning to the solution $y(t)$.


Complete Solution $y_p + y_n$


Let me summarize the case of undamped forced oscillation (driving force $F\cos\omega t$).
If $B = 0$, the complete solution to $Ay'' + Cy = F\cos\omega t$ is one particular solution $y_p$
plus any null solution $y_n$ at the natural frequency $\omega_n = \sqrt{C/A}$. Notice the two $\omega$’s:

$$\text{Particular solution } (\omega) + \text{Unforced solution } (\omega_n) \qquad y = y_p + y_n = \frac{F}{C - A\omega^2}\cos\omega t + c_1\cos\omega_nt + c_2\sin\omega_nt \tag{8}$$

To repeat: Any time we have a linear equation $Ly = f$, the complete solution has the
form $y = y_p + y_n$. The particular solution solves $Ly_p = f$. The null solution solves $Ly_n = 0$.
Linearity of $L$ guarantees that $y = y_p + y_n$ solves $Ly = f$:

$$\text{Complete solution } y = y_p + y_n \qquad \text{If } Ly_p = f \text{ and } Ly_n = 0 \text{ then } Ly = f. \tag{9}$$

This book emphasizes linear equations. You will see $y_p + y_n$ again, always with the rule of
linearity $Ly = Ly_p + Ly_n$. This applies to linear differential equations and matrix
equations. In differential equations, $L$ is called a linear operator.




$$\text{Linear operator} \qquad Ly = Ay'' + By' + Cy \quad\text{or}\quad Ly = A_N\frac{d^Ny}{dt^N} + \cdots + A_1\frac{dy}{dt} + A_0y$$
For an operator $L$, the inputs $y$ and the outputs $Ly$ are functions.
Every solution to $Ly = f$ has the form $y_p + y_n$. Suppose we start with one
particular solution $y_p$. If $y$ is any other solution, then $L(y - y_p) = 0$:

$$y_n = y - y_p \text{ is a null solution} \qquad Ly_n = Ly - Ly_p = f - f = 0. \tag{10}$$


Example 4 Suppose the linear equation is just $Ly = x_1 - x_2 = 1$: one equation
in two unknowns $x_1$ and $x_2$. The solutions are vectors $y = (x_1, x_2)$. The right side $f = 1$
is not zero. The bold line in Figure 2.11 is the graph of all solutions.


[Figure 2.11 shows two parallel lines. Null solution line: $Ly_n = 0$, which is $x_1 - x_2 = 0$. Complete solution line: $Ly = 1$, which is $x_1 - x_2 = 1$. One particular solution $y_p$ with $Ly_p = 1$ is marked, together with $y = y_p + y_n$.]

Figure 2.11: Complete solution = one particular solution + all null solutions.


Every point on that bold line is a particular solution to $x_1 - x_2 = 1$. We marked only
one $y_p$. Null solutions lie on a parallel line $x_1 - x_2 = 0$ through the center $(0, 0)$.


Example 5 Second order equations $Ay'' + By' + Cy = e^{st}$ or $e^{i\omega t}$ have complete
solutions $y = y_p + y_n$. The particular solution $y_p = Ye^{st}$ is a multiple of $e^{st}$.
The null solutions are $y_n = c_1e^{s_1t} + c_2e^{s_2t}$. If $s_2 = s_1$, replace $e^{s_2t}$ by $te^{s_1t}$.


Example 6 The complete solution to the impressive equation $5y = 10$ is $y = 2$. This
is our only choice for the particular solution, $y_p = 2$. The null solutions solve $5y_n = 0$,
and the only possibility is $y_n = 0$. The one and only solution is $y = y_p + y_n = 2 + 0$.


That seems boring, when $y_n = 0$ is the only null solution. But this is what we want (and
usually get) for matrix equations. If $A$ is an invertible matrix, the only solution to $Ay = b$ is
$y = y_p = A^{-1}b$. Then the only null solution to $Ay_n = 0$ is $y_n = 0$.


Higher Order Equations 


Up to this moment, third derivatives have not been seen. They don’t arise often in
physical problems. But exponential solutions $Ye^{st}$ and $Ye^{i\omega t}$ still appear. The one essential
requirement is that the equation must have constant coefficients.

$$\text{Equation of order } N \qquad A_N\frac{d^Ny}{dt^N} + \cdots + A_1\frac{dy}{dt} + A_0y = f(t) \tag{11}$$

When $f = 0$, the best solutions of the null equation are still exponentials $y_n = e^{st}$.


Substitute $e^{st}$ into the equation to find $N$ possible exponents $s_1, s_2, \ldots, s_N$.

$$f = 0 \text{ and } y_n = e^{st} \qquad (A_Ns^N + \cdots + A_1s + A_0)\,e^{st} = 0. \tag{12}$$

The exponents $s$ in $y_n$ are the $N$ roots of that polynomial. So we (usually) have $N$
independent solutions $e^{s_1t}, \ldots, e^{s_Nt}$. All their combinations are still solutions. If the
polynomial in (12) happens to have a double root at $s$, our two solutions are $e^{st}$ and $te^{st}$.


Example 7 Solve the third order equation $y''' + 2y'' + y' = e^{3t}$.
Solution To find the null solutions $y_n$, substitute $y_n = e^{st}$ with right hand side zero:

$$s^3 + 2s^2 + s = 0 \qquad s(s^2 + 2s + 1) = 0 \qquad s(s + 1)^2 = 0.$$

The exponents are $s = 0, -1, -1$. The null solutions are $c_1e^{0t}$ and $c_2e^{-t}$ and $c_3te^{-t}$
(the extra $t$ comes from the double root). A particular solution $y_p$ is $Ye^{3t}$ (since 3 is not
one of the exponents 0 and $-1$ in $y_n$). Substitute $Ye^{3t}$ to find $Y = 1/48$:

$$27Ye^{3t} + 18Ye^{3t} + 3Ye^{3t} = e^{3t} \quad\text{and}\quad 48Y = 1 \quad\text{and}\quad y_p = e^{3t}/48.$$

The transfer function is $Y(s) = 1/(s^3 + 2s^2 + s)$. For $e^{3t}$ put $s = 3$. Then $Y = 1/48$.
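
The same recipe works for any order once the polynomial is stored as a list of coefficients. A minimal sketch, assuming the hypothetical helpers `poly_eval` (Horner's rule) and `exponential_response`, applied to Example 7:

```python
def poly_eval(coeffs, s):
    """Evaluate P(s) by Horner's rule; coeffs = [A_N, ..., A_1, A_0]."""
    value = 0
    for a in coeffs:
        value = value * s + a
    return value

def exponential_response(coeffs, c):
    """Return Y = 1/P(c), the multiplier in y_p = Y e^{ct}."""
    p = poly_eval(coeffs, c)
    if p == 0:
        raise ZeroDivisionError("resonance: P(c) = 0")
    return 1 / p

P = [1, 2, 1, 0]                            # y''' + 2y'' + y'  gives  s^3 + 2s^2 + s
print(poly_eval(P, 0), poly_eval(P, -1))    # both exponents 0 and -1 are roots
print(poly_eval(P, 3), exponential_response(P, 3))   # P(3) = 48, so Y = 1/48
```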


Here is the plan for this section on constant coefficient equations with forced oscillations.
1 Find the exponential response $y(t) = Y(s)e^{st}$ to the driving function $f(t) = e^{st}$.
2 Adjust that formula when $Y(s) = \infty$ because of resonance.
3 Solve the real equation $Ay'' + By' + Cy = \cos\omega t$ to see the effect of damping.


This is the key example for applications: $y$ is the real part of $Y(s)e^{st}$ when $s = i\omega$.
The solution in equation (23) is $y(t) = M\cos\omega t + N\sin\omega t = G\cos(\omega t - \alpha)$.


Exponential Response Function = Transfer Function 


This book concentrates on first and second order equations. When the coefficients are
constant and the right side is an exponential, we have solved three important problems:


First order     $y' - ay = e^{ct}$             $y_p = e^{ct}/(c - a)$
Oscillation     $my'' + ky = e^{i\omega t}$    $y_p = e^{i\omega t}/(k - m\omega^2)$
Second order    $Ay'' + By' + Cy = e^{st}$     $y_p = e^{st}/(As^2 + Bs + C)$


It is natural (natural to a mathematician) to try to solve all constant coefficient equations
of all orders by one formula. We can almost do it, but resonance gets in the way.
Let me write $D$ for each derivative $d/dt$. Then $D^2$ is $d^2/dt^2$. All our equations involve
powers of $D$, and equations of order $N$ involve $D^N$. Here $N = 2$.

$$\text{Polynomial } P(D) \qquad Ay'' + By' + Cy = (AD^2 + BD + C)\,y = P(D)\,y. \tag{13}$$
The null solutions and the particular solution all come from this polynomial $P(D)$.

$$\text{Find } N \text{ null solutions } y_n = e^{st} \qquad As^2 + Bs + C = 0 \text{ is exactly } P(s) = 0 \tag{14}$$
$$\text{Find a particular } y_p = Ye^{ct} \qquad P(D)y = e^{ct} \text{ gives the number } Y = 1/P(c) \tag{15}$$

The value $Y$ of the transfer function gives the exponential response $y_p = e^{ct}/P(c)$.


Please understand: In the null solutions, $s$ has $N$ specific values $s_1, \ldots, s_N$. Those are
the roots of the $N$th degree characteristic equation $P(s) = 0$. In the particular solution
$e^{ct}/P(c)$, the specific value $s = c$ is the exponent in the right hand side $f = e^{ct}$.

The exponents $c$ and $s$ are completely allowed to be imaginary or complex.

$$P(D)y = e^{ct} \qquad y = y_p + y_n = \frac{e^{ct}}{P(c)} + c_1e^{s_1t} + \cdots + c_Ne^{s_Nt} \tag{16}$$

That fraction $Y = 1/P(c)$ “transfers” the input $f = e^{ct}$ into the output $y = Ye^{ct}$. You
often see it as $1/P(s)$ with the variable $s$. It is sometimes called the system function.

There is only one exception to this simple and beautiful exponential response formula.
The forcing exponent $c$ might be one of the exponents $s_1, \ldots, s_N$ in the null solution.
In this case $P(c)$ is zero. We cannot divide by $P(c)$ when it is zero.


Exception If $P(c) = 0$ then $y = e^{ct}/P(c)$ cannot solve $P(D)y = e^{ct}$.


$P(c) = 0$ is the exceptional case of resonance. The formula $e^{ct}/P(c)$ has to change.


Resonance 


We may be pushing a swing at its natural frequency. Then $c = i\omega_n = i\sqrt{k/m}$. The
polynomial $P(D)$ from $my'' + ky$ is $mD^2 + k$, and we have $P(c) = 0$ at this natural
frequency. Here is the exponential response formula adjusted for resonance.

$$\text{Resonant response} \qquad \text{If } P(c) = 0 \text{ then } y_p = \frac{t\,e^{ct}}{P'(c)} \tag{17}$$

That extra factor $t$ enters the solution when $P(c) = 0$. We replace $1/P(c)$ by $t/P'(c)$.
This succeeds unless there is “double resonance” and $P'(c)$ is also zero. Then the formula
moves on to the second derivative of $P$, and $y_p(t) = t^2e^{ct}/P''(c)$.

The odds against double resonance are pretty high. The point is that the equation
$P(D)y = e^{ct}$ has a neat solution in terms of the polynomial $P$: usually $y = e^{ct}/P(c)$.
I can explain that resonant solution $y = te^{ct}/P'(c)$ when $P(c) = 0$ and $P'(c) \neq 0$.
We have seen this happen in Section 1.5 for the first order equation $y' - ay = e^{ct}$.
That equation has $P(D) = D - a$ and $P(c) = c - a$ and resonance when $c = a$:

$$y' - ay = e^{ct} \text{ has the very particular solution } y_{vp} = \frac{e^{ct} - e^{at}}{c - a}$$

$$\text{As } c \text{ approaches } a \text{, this approaches } \frac{\text{derivative of top}}{\text{derivative of bottom}} = \frac{te^{at}}{1}$$

That is l’Hôpital’s Rule! The only unusual thing is that we have $c$ in place of $x$, and
$c$-derivatives in place of $x$-derivatives. The very particular solution is the one starting from
$y_{vp} = 0$ at $t = 0$. The resonant solution $te^{at}$ fits our formula $te^{ct}/P'(c)$ because
$c = a$ and $P(c) = c - a$ and $P'(c) = 1$.

When the equation has order $N$, the polynomial $P$ has degree $N$. Suppose the exponent
$c$ is close to $a$, which is one of the exponents $s_1, \ldots, s_N$ in the null solution. Then
$P(a) = 0$ and $e^{at}$ is a null solution and $e^{ct}/P(c)$ is one particular solution:

$$\text{A very particular solution to } P(D)y = e^{ct} \text{ is } y_{vp} = \frac{e^{ct} - e^{at}}{P(c) - P(a)} \tag{18}$$

To emphasize: $c$ close to $a$ is fine. But $c = a$ is not fine. Formula (16) changes at $c = a$:

$$\text{Resonance} \qquad \text{If } c = a \text{ then l’Hôpital’s limit in (16) is } y_p = \frac{t\,e^{at}}{P'(a)} \tag{19}$$

Take the $c$-derivatives of $e^{ct} - e^{at}$ and $P(c) - P(a)$ at $c = a$, to get $te^{at}$ and $P'(a)$.
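
The limit from (18) to (19) can be watched numerically. This sketch reuses the polynomial $s^2 + 5s + 6$ of Example 1 with the null exponent $a = -2$, a choice made here for illustration; the helper names are also assumptions. As $c$ approaches $a$, the very particular solution approaches $te^{at}/P'(a)$.

```python
import math

def poly_eval(coeffs, s):
    """Evaluate P(s) by Horner's rule; coeffs = [A_N, ..., A_1, A_0]."""
    value = 0
    for a in coeffs:
        value = value * s + a
    return value

def poly_deriv_eval(coeffs, s):
    """Evaluate the derivative P'(s) for the same coefficient list."""
    n = len(coeffs) - 1
    value = 0
    for k, a in enumerate(coeffs[:-1]):
        value = value * s + (n - k) * a
    return value

def very_particular(coeffs, a, c, t):
    """Formula (18): (e^{ct} - e^{at}) / (P(c) - P(a)), valid for c != a."""
    return (math.exp(c * t) - math.exp(a * t)) / (poly_eval(coeffs, c) - poly_eval(coeffs, a))

def resonant(coeffs, a, t):
    """Formula (19): t e^{at} / P'(a), valid when P(a) = 0 and P'(a) != 0."""
    return t * math.exp(a * t) / poly_deriv_eval(coeffs, a)

P, a, t = [1, 5, 6], -2.0, 1.0           # P(a) = 0 and P'(a) = 2a + 5 = 1
for h in (1e-2, 1e-4, 1e-6):
    print(h, very_particular(P, a, a + h, t), resonant(P, a, t))
```

The gap between the two columns shrinks in proportion to $h = c - a$, which is what l’Hôpital’s Rule predicts.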


Summary The transfer function is $Y(s) = 1/P(s)$. It has “poles” at the $N$ roots of
$P(s) = 0$. Those are the exponents in the null solutions $y_n(t)$. The particular solution
$y_p = Ye^{ct}$ has the same exponent $c$ as the driving term $f = e^{ct}$. The transfer function
$Y(c) = 1/P(c)$ decides the amplitude of $y_p(t)$. If $c$ is a pole of $Y$, we have resonance.


Example 8 The 4th degree equation $D^4y = d^4y/dt^4 = 1$ has 4-way resonance.


What are the null solutions to $y'''' = 0$? By trying $y = e^{st}$ we get $s^4 = 0$. This has
all four roots at $s = 0$. Then one null solution is $y = e^{0t}$, which is $y = 1$. The other null
solutions have factors $t, t^2, t^3$ because of the four-way zero. Altogether:


The null solutions to $y'''' = 0$ have the form $y_n(t) = c_1 + c_2t + c_3t^2 + c_4t^3$.


Now find a particular solution to $y'''' = e^{ct}$. For most exponents $c$ we get $y_p = e^{ct}/c^4$.
This is exactly $e^{ct}/P(c)$. But $c = 0$ gives quadruple resonance: $c^4 = 0$ has a 4-way root.
A quadruple l’Hôpital rule gives the fourth derivative $P''''$ and the very particular solution to
$y'''' = 1$ that you knew before taking this course and seeing this book:

$$y'''' = 1 = e^{0t} \text{ has } c = a = 0 \text{ and } P = s^4 \qquad y_p(t) = \frac{t^4e^{0t}}{P''''(0)} = \frac{t^4}{4!} = \frac{t^4}{24}.$$
Real Second Order Equations with Damping


Now we focus on the key equation: second order. The left side is $Ay'' + By' + Cy$.
The transfer function is $Y(s) = 1/(As^2 + Bs + C)$. When the right side is $f(t) = e^{i\omega t}$,
the exponent is $s = i\omega$. When $A, B, C$ are nonzero, we won’t have resonance:

$$\text{No resonance} \qquad A(i\omega)^2 + B(i\omega) + C = (C - A\omega^2) + i(B\omega) \neq 0.$$

We know that the response to $f(t) = e^{i\omega t}$ is $y_p(t) = Y(i\omega)e^{i\omega t}$. This is a perfect example,
except that those functions are not real.

In applications to real life (and this equation has many), we want $f(t) = \cos\omega t$.
We must solve this problem. You will say, just solve for $e^{i\omega t}$ and $e^{-i\omega t}$, and take half
of each solution. Even faster than that, solve for $e^{i\omega t}$ and take the real part of $y_p(t)$.
Or you could stay entirely real and look for a solution $y(t) = M\cos\omega t + N\sin\omega t$.


All those ideas will succeed. They all give the same answer (in different forms). 
The best form has to bring out the most important number in the answer y(t). That 
number is the amplitude G of the forced oscillation. So first place goes to the 
polar form y(t) = G cos(wt — a), because this shows the gain G. 

The null solutions decay because the roots $s_1$ and $s_2$ of $As^2 + Bs + C = 0$ have
negative real parts $-B/2A$. The particular solution $G\cos(\omega t - \alpha)$ does not decay, because
it is driven by a forcing function $f = \cos\omega t$ that never stops.

The next pages will find G and a. This is algebra put to good use. We are working with 
letters A,B,C that represent physical quantities. In Section 2.5 they will be 
mass-damping-stiffness or inductance-resistance-inverse capacitance. Those are not the 
only possible examples! Biology and chemistry and management and the economics of 
a whole country also see damped oscillations. I hope you will find those models. 


Damped Oscillations in Rectangular Form 


I will start with the rectangular form $y(t) = M\cos\omega t + N\sin\omega t$. It is not as useful as
the polar form, but it is easier to compute. Substitute this $y(t)$ into the differential equation
$Ay'' + By' + Cy = \cos\omega t$. Match the cosine terms and the sine terms:

$$\text{Cosines on both sides} \qquad -A\omega^2M + B\omega N + CM = 1 \tag{20}$$

$$\text{Sines on the left side} \qquad -A\omega^2N - B\omega M + CN = 0 \tag{21}$$

To solve for $M$, multiply equation (20) by $C - A\omega^2$. Then multiply equation (21) by
$B\omega$ and subtract from (20). The coefficient of $N$ will be zero. So $N$ is eliminated and
we have an equation for $M$ alone. $M$ is multiplied by the important number $D$:

$$C - A\omega^2 \text{ times (20) minus } B\omega \text{ times (21)} \qquad \left[(C - A\omega^2)^2 + (B\omega)^2\right]M = DM = C - A\omega^2. \tag{22}$$
We divide by $D$ to find $M = (C - A\omega^2)/D$. Then equation (21) tells us $N = B\omega/D$.
And equation (27) will tell us that $M^2 + N^2 = 1/D$.

$$\text{Real solution } y_p \text{ is } M\cos\omega t + N\sin\omega t \qquad M = \frac{C - A\omega^2}{D} \qquad N = \frac{B\omega M}{C - A\omega^2} = \frac{B\omega}{D} \tag{23}$$

Let me say right away: The complex number $Y(i\omega)$ is just $M - iN$. This calculation
will connect real to complex and rectangular to polar. When I multiply and divide
by $Y(-i\omega)$, you will see that the denominator of $Y(i\omega)$ is $D = (C - A\omega^2)^2 + (B\omega)^2$:

$$Y(i\omega) = \frac{1}{(C - A\omega^2) + iB\omega} \cdot \frac{(C - A\omega^2) - iB\omega}{(C - A\omega^2) - iB\omega} = \frac{(C - A\omega^2) - iB\omega}{D} = M - iN. \tag{24}$$


$Y = M - iN$ is exactly what we want and need. The input $f = \cos\omega t$ is the real part
of $e^{i\omega t}$, so the output $y$ is the real part of $Ye^{i\omega t}$. That real part is the rectangular form
$y = M\cos\omega t + N\sin\omega t$:

$$\text{Re}\,(Ye^{i\omega t}) = \text{Re}\,[(M - iN)(\cos\omega t + i\sin\omega t)] = M\cos\omega t + N\sin\omega t \tag{25}$$
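
Equations (22) to (25) say that the real formulas for $M$ and $N$ agree with the complex number $Y(i\omega) = M - iN$. A short sketch of that comparison, with hypothetical function names and test coefficients chosen only for illustration:

```python
def rectangular(A, B, C, w):
    """Return (M, N, D) from equations (22) and (23) for Ay'' + By' + Cy = cos(wt)."""
    D = (C - A * w * w) ** 2 + (B * w) ** 2
    if D == 0:
        raise ZeroDivisionError("resonance: P(iw) = 0")
    M = (C - A * w * w) / D
    N = B * w / D
    return M, N, D

def complex_gain(A, B, C, w):
    """Y(iw) = 1 / (A (iw)^2 + B (iw) + C), computed with complex arithmetic."""
    s = 1j * w
    return 1 / (A * s * s + B * s + C)

M, N, D = rectangular(2.0, 3.0, 5.0, 1.7)
Y = complex_gain(2.0, 3.0, 5.0, 1.7)
print(M, N, Y)                 # Y.real matches M and Y.imag matches -N
print(M * M + N * N, 1 / D)    # these two agree, as equation (27) states
```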


Damped Oscillations in Polar Form 


The solution we want is the real part of $Y(i\omega)e^{i\omega t}$. Equation (25) computed that solution
in its rectangular form. To compute $y(t)$ in polar form, the first step (almost the only step)
is to put $Y(i\omega)$ in polar form. This number is the complex gain:

$$\text{Complex gain} \qquad Y(i\omega) = M - iN = Ge^{-i\alpha} \quad\text{with}\quad G = \sqrt{M^2 + N^2} = \frac{1}{\sqrt{D}} \quad\text{and}\quad \tan\alpha = \frac{N}{M} \tag{26}$$

That amplitude $G$ is simply called the “gain”. It is the most important quantity in all these
pages of calculations. The input $\cos\omega t$ had amplitude 1, the output $y(t)$ has amplitude $G$.
Of course that output is not $y = G\cos\omega t$! Damping produces a phase lag $\alpha$. At
the same time damping reduces the amplitude of the output.


The undamped amplitude $|Y| = 1/|C - A\omega^2|$ is reduced to $G = 1/\sqrt{D}$:

$$G = \sqrt{M^2 + N^2} = \left(\frac{(C - A\omega^2)^2}{D^2} + \frac{(B\omega)^2}{D^2}\right)^{1/2} = \left(\frac{D}{D^2}\right)^{1/2} = \frac{1}{\sqrt{D}} \tag{27}$$

I will collect all these beautiful (?) important (!) formulas after one example.


Example 9 Solve $y'' + y' + 2y = \cos t$ in rectangular form and also in polar form.


Solution The equation has $A = 1$, $B = 1$, $C = 2$, and $\omega = 1$. We are finding a particular
solution. Let me use the formulas directly and then comment briefly. The numbers give
$C - A\omega^2 = 1$ and $B\omega = 1$, so $D = 1^2 + 1^2 = 2$.
Therefore the solution has $G = 1/\sqrt{2}$ and $M = N = \frac{1}{2}$ and $\tan\alpha = 1$ and $\alpha = \pi/4$:

$$\text{Rectangular} \qquad y(t) = M\cos\omega t + N\sin\omega t = \frac{1}{2}(\cos t + \sin t)$$

$$\text{Polar} \qquad y(t) = \text{Re}\,(Ge^{-i\alpha}e^{i\omega t}) = G\cos(\omega t - \alpha) = \frac{1}{\sqrt{2}}\cos\left(t - \frac{\pi}{4}\right).$$

For this example we verify directly that polar = rectangular:

$$G\cos\left(t - \frac{\pi}{4}\right) = \frac{1}{\sqrt{2}}\left(\cos t\cos\frac{\pi}{4} + \sin t\sin\frac{\pi}{4}\right) = \frac{1}{2}(\cos t + \sin t).$$

The rectangular form has simpler numbers. But the polar form has the most important
number $G = 1/\sqrt{2}$. That gain $G$ is less than the undamped gain $|Y|$ by a factor $\cos\alpha$.

$$\text{Undamped} \quad |Y| = \frac{1}{|C - A\omega^2|} = 1 \qquad \text{Damped} \quad G = \frac{1}{\sqrt{D}} = \frac{1}{\sqrt{2}} = \cos\alpha.$$
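
Example 9 can be reproduced with a few lines of Python. This is a sketch under one assumption worth stating: the phase lag is computed with `math.atan2(B*w, C - A*w*w)`, which picks the quadrant of $\alpha$ from the signs of $N$ and $M$ (a plain arctangent of $N/M$ would lose that when $C < A\omega^2$). The function name `polar` is illustrative.

```python
import math

def polar(A, B, C, w):
    """Return the gain G = 1/sqrt(D) and phase lag alpha with tan(alpha) = N/M."""
    re = C - A * w * w            # real part of P(iw)
    im = B * w                    # imaginary part of P(iw)
    D = re * re + im * im
    G = 1 / math.sqrt(D)
    alpha = math.atan2(im, re)    # same angle as atan2(N, M), since N = im/D and M = re/D
    return G, alpha

G, alpha = polar(1.0, 1.0, 2.0, 1.0)      # Example 9: y'' + y' + 2y = cos t
print(G, alpha)                            # 1/sqrt(2) and pi/4
for t in (0.0, 1.0, 2.5):
    # polar form and rectangular form give the same y(t)
    print(G * math.cos(t - alpha), 0.5 * (math.cos(t) + math.sin(t)))
```

With signed $Y = 1/(C - A\omega^2)$, the identity $G = Y\cos\alpha$ holds on both sides of the natural frequency, because $Y$ and $\cos\alpha$ change sign together.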


Undamped versus Damped 


The undamped equation $Ay'' + Cy = \cos\omega t$ has $B = 0$ and $Y = 1/(C - A\omega^2)$.
Compare that amplitude of $y(t) = Y\cos\omega t$ from Section 2.1 with the harder problem
we just solved. The comparison lets you see how the damping contributes $Bs = Bi\omega$ in
the transfer function that multiplies the input $e^{i\omega t}$. Damping causes a phase lag $\alpha$.
Damping also reduces the amplitude to $G = Y\cos\alpha$. Here are the key formulas:


             Undamped                         Damped
Equation     $Ay'' + Cy = \cos\omega t$       $Ay'' + By' + Cy = \cos\omega t$
Solution     $y = Y\cos\omega t$              $y = G\cos(\omega t - \alpha)$
Magnitude    $|Y| = 1/|C - A\omega^2|$        $G = 1/\sqrt{D} = Y\cos\alpha$
Phase lag    zero                             $\tan\alpha = N/M = B\omega/(C - A\omega^2)$


When the driving function is $F\cos\omega t$, the solutions include that extra factor $F$.
When the driving function is $\sin\omega t$, that is the same as $\cos(\omega t - \frac{\pi}{2})$. So the solutions
have $\phi = \pi/2$ as an additional phase lag: $y = G\cos(\omega t - \alpha - \pi/2) = G\sin(\omega t - \alpha)$.

When the driving function is $A\cos\omega t + B\sin\omega t$, that equals $R\cos(\omega t - \phi)$. This is
the sinusoidal identity from Section 1.5. Then the solution is $RG\cos(\omega t - \alpha - \phi)$.


This is the particular solution $y_p$ that oscillates with the same frequency $\omega$ as the input.
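
The response $RG\cos(\omega t - \alpha - \phi)$ can be compared with what linearity gives term by term. In this sketch the driving function is written $a\cos\omega t + b\sin\omega t$; the lowercase names `a` and `b` are an assumption made here to keep the input amplitudes apart from the equation's coefficients $A$ and $B$. The coefficient values are arbitrary illustrations.

```python
import math

def forced_response_polar(A, B, C, w, a, b, t):
    """Response to a cos(wt) + b sin(wt) in the form R G cos(wt - alpha - phi)."""
    re, im = C - A * w * w, B * w
    G = 1 / math.sqrt(re * re + im * im)
    alpha = math.atan2(im, re)       # phase lag from damping
    R = math.hypot(a, b)             # input amplitude in R cos(wt - phi)
    phi = math.atan2(b, a)           # input phase
    return R * G * math.cos(w * t - alpha - phi)

def forced_response_linear(A, B, C, w, a, b, t):
    """Same response by linearity: cos gives M cos + N sin, sin gives M sin - N cos."""
    re, im = C - A * w * w, B * w
    D = re * re + im * im
    M, N = re / D, im / D
    c, s = math.cos(w * t), math.sin(w * t)
    return a * (M * c + N * s) + b * (M * s - N * c)

for t in (0.0, 0.4, 3.0):
    print(forced_response_polar(1, 4, 3, 2, 3, -4, t),
          forced_response_linear(1, 4, 3, 2, 3, -4, t))
```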


Let me show why the gain is reduced to $G = Y\cos\alpha$ from its undamped value
$|Y| = 1/|C - A\omega^2|$. We know from (27) that $G = \sqrt{M^2 + N^2} = 1/\sqrt{D}$. And we
know from (23) that $YM = 1/D$:

$$\text{Damped gain} \qquad Y\cos\alpha = \frac{YM}{\sqrt{M^2 + N^2}} = \frac{1/D}{1/\sqrt{D}} = G. \tag{28}$$


Better Notation 


A good plan is to divide $my'' + by' + ky = kF(t)$ by the mass $m$, for several reasons:

$$y'' + \frac{b}{m}y' + \frac{k}{m}y = \frac{k}{m}F(t). \tag{29}$$

First, the coefficient of $y''$ becomes 1. Second, replacing $k/m$ by $\omega_n^2$ gives it meaning.
Third, the input $F$ has the same units as the output $y$. So now the gain $G = |y|/|F|$ is
dimensionless. This happened because the original $f(t)$ with unsuitable units was
replaced by $kF(t)$, which is now divided by $m$.

Most valuable of all is a new way to write the damping term $b/m$, which is $B/A$.
The key point is that $b^2$ and $mk$ have the same dimensions. From the equation,
$my''$ and $by'$ and $ky$ have the same dimensions. Then so do $(by')^2$ and $(my'')(ky)$. And
also $(y')^2$ and $(y'')(y)$; they both contain $1/(\text{time})^2$. This leaves $b^2$ and $mk$.

This quantity $Z = b/\sqrt{4mk}$ is highly useful. Overdamping is $Z > 1$. Underdamping
is $Z < 1$. The coefficient $b/m$ in equation (29) has a better form $2Z\omega_n$ in (30).

$$\frac{b}{m} = \frac{2b}{\sqrt{4mk}}\sqrt{\frac{k}{m}} = 2Z\omega_n \qquad y'' + 2Z\omega_ny' + \omega_n^2y = \omega_n^2F(t) \tag{30}$$

$Z$ is the damping ratio. The correct symbol is a Greek zeta ($\zeta$). But a capital zeta = $Z$
is so much easier to read and write. (The MATLAB command is also named zeta.)
Watch how this ratio of $B$ to $\sqrt{4AC}$ brings out the important parts of every formula.
If $Z < 1$, the natural frequency $\omega_n$ is reduced to the damped frequency $\omega_d = \omega_n\sqrt{1 - Z^2}$.

$$\text{Roots } s_1 \text{ and } s_2 \qquad s^2 + 2Z\omega_ns + \omega_n^2 = 0 \text{ gives } s = -Z\omega_n \pm \omega_n\sqrt{Z^2 - 1} \tag{31}$$

$$\text{Underdamping} \qquad Z^2 = \frac{b^2}{4mk} < 1 \text{ and } s = -Z\omega_n \pm i\omega_d \tag{32}$$

$$\text{Null solutions} \qquad y_n(t) = e^{-Z\omega_nt}(c_1\cos\omega_dt + c_2\sin\omega_dt) \tag{33}$$

The null solutions are not pure oscillations. They include the exponential $e^{-Z\omega_nt}$.
Their frequency changes to $\omega_d$. The graph of $y(t)$ oscillates as it approaches zero, and the
peak times when $y = y_{\max}$ are spaced by $2\pi/\omega_d$.
The page after Problem Set 2.4 collects our solution formulas in one place.
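
The quantities in (30) to (32) are quick to compute from $m$, $b$, $k$. A minimal sketch, assuming a hypothetical helper `damping_summary` and sample values chosen only for illustration; `cmath.sqrt` lets the same formula (31) cover both $Z < 1$ and $Z > 1$.

```python
import cmath
import math

def damping_summary(m, b, k):
    """Return (w_n, Z, (s1, s2), w_d) for my'' + by' + ky; w_d is None unless Z < 1."""
    wn = math.sqrt(k / m)                 # natural frequency
    Z = b / math.sqrt(4 * m * k)          # damping ratio
    root = cmath.sqrt(Z * Z - 1)          # imaginary when Z < 1 (underdamping)
    s1 = -Z * wn + wn * root              # equation (31)
    s2 = -Z * wn - wn * root
    wd = wn * math.sqrt(1 - Z * Z) if Z < 1 else None   # damped frequency
    return wn, Z, (s1, s2), wd

print(damping_summary(1.0, 2.0, 5.0))     # underdamped: s^2 + 2s + 5 has roots -1 +/- 2i
print(damping_summary(1.0, 10.0, 1.0))    # overdamped: Z = 5, two real roots
```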
REVIEW OF THE KEY IDEAS


1. A particular solution to $Ay'' + By' + Cy = e^{st}$ is $e^{st}/(As^2 + Bs + C)$.

2. This is a constant coefficient equation $P(D)y = e^{ct}$ with solution $y_p = e^{ct}/P(c)$.

3. Resonance occurs if $e^{ct}$ is a null solution of $P(D)y = 0$. This means that $P(c) = 0$.

4. Resonance leads to an extra $t$: $y_p(t) = te^{ct}/P'(c)$ when $P(c) = 0$ and $P'(c) \neq 0$.

5. For second order equations with $f = \cos\omega t$ the gain is $G = 1/|P(i\omega)| = 1/\sqrt{D}$.

6. The real solution is $M\cos\omega t + N\sin\omega t = G\cos(\omega t - \alpha)$ with $\tan\alpha = N/M$.

7. With damping ratio $Z = B/\sqrt{4AC}$, the equation is $y'' + 2\omega_nZy' + \omega_n^2y = \omega_n^2F(t)$.

8. If $Z < 1$, the damped frequency is $\omega_d = \omega_n\sqrt{1 - Z^2}$. Then $s_1, s_2$ are $-Z\omega_n \pm i\omega_d$.


Problem Set 2.4


Problems 1-4 use the exponential response $y_p = e^{ct}/P(c)$ to solve $P(D)y = e^{ct}$.


1 Solve these constant coefficient equations with exponential driving force:
(a) $y_p'' + 3y_p' + 5y_p = e^t$ (b) $2y_p'' + 4y_p = e^{it}$ (c) $y'''' = e^t$

2 These equations $P(D)y = e^{ct}$ use the symbol $D$ for $d/dt$. Solve for $y_p(t)$:
(a) $(D^2 + 1)y_p(t) = 10e^{-3t}$ (b) $(D^2 + 2D + 1)y_p(t) = e^{i\omega t}$
(c) $(D^4 + D^2 + 1)y_p(t) = e^{i\omega t}$

3 How could $y_p = e^{ct}/P(c)$ solve $y'' + y = e^te^{it}$ and then $y'' + y = e^t\cos t$?

4 (a) What are the roots $s_1$ to $s_3$ and the null solutions to $y_n''' - y_n = 0$?
(b) Find particular solutions to $y_p''' - y_p = e^{it}$ and to $y_p''' - y_p = e^t - e^{i\omega t}$.


Problems 5-6 involve repeated roots $s$ in $y_n$ and resonance $P(c) = 0$ in $y_p$.


5 Which value of $C$ gives resonance in $y'' + Cy = e^{i\omega t}$? Why do we never get resonance
in $y'' + 5y' + Cy = e^{i\omega t}$?

6 Suppose the third order equation $P(D)y_n = 0$ has solutions $y = c_1e^t + c_2e^{2t} + c_3e^{3t}$.
What are the null solutions to the sixth order equation $P(D)P(D)y_n = 0$?

7 Complete this table with equations for roots $s_1$ and $s_2$ and solutions $y_n$ and $y_p$:

Undamped free oscillation      $my'' + ky = 0$                      $y_n = $ ____
Undamped forced oscillation    $my'' + ky = e^{i\omega t}$          $y_p = $ ____
Damped free motion             $my'' + by' + ky = 0$                $y_n = $ ____
Damped forced motion           $my'' + by' + ky = e^{i\omega t}$    $y_p = $ ____

8 Complete the same table when the coefficients are 1 and $2Z\omega_n$ and $\omega_n^2$ with $Z < 1$.

Undamped and free         $y'' + \omega_n^2y = 0$                                   $y_n = $ ____
Undamped and forced       $y'' + \omega_n^2y = e^{i\omega t}$                       $y_p = $ ____
Underdamped and free      $y'' + 2Z\omega_ny' + \omega_n^2y = 0$                    $y_n = $ ____
Underdamped and forced    $y'' + 2Z\omega_ny' + \omega_n^2y = e^{i\omega t}$        $y_p = $ ____

9 What equations $y'' + By' + Cy = f$ have these solutions?
(a) $y = c_1\cos 2t + c_2\sin 2t + \cos 3t$
(b) $y = c_1e^{-t}\cos 4t + c_2e^{-t}\sin 4t + \cos 5t$
(c) $y = c_1e^{-t} + c_2te^{-t} + e^{i\omega t}$

10 If $y_p = te^{-6t}\cos 7t$ solves a second order equation $Ay'' + By' + Cy = f$,
what does that tell you about $A$, $B$, $C$, and $f$?

11 (a) Find the steady oscillation $y_p(t)$ that solves $y'' + 4y' + 3y = 5\cos\omega t$.
(b) Find the amplitude $A$ of $y_p(t)$ and its phase lag $\alpha$.
(c) Which frequency $\omega$ gives maximum amplitude (maximum gain)?

12 Solve $y'' + y = \sin\omega t$ starting from $y(0) = 0$ and $y'(0) = 0$. Find the limit of $y(t)$
as $\omega$ approaches 1, and the problem approaches resonance.

13 Does critical damping and a double root $s = -1$ in $y'' + 2y' + y = e^{ct}$ produce an extra
factor $t$ in the null solution $y_n$ or in the particular $y_p$ (proportional to $e^{ct}$)? What is $y_n$
with constants $c_1, c_2$? What is $y_p = Ye^{ct}$?

14 If $c = i\omega$ in Problem 13, the solution $y_p$ to $y'' + 2y' + y = e^{i\omega t}$ is ____. That fraction
$Y$ is the transfer function at $i\omega$. What are the magnitude and phase in $Y = Ge^{-i\alpha}$?


By rescaling both $t$ and $y$, we can reach $A = C = 1$. Then $\omega_n = 1$ and
$B = 2Z$. The model problem is $y'' + 2Zy' + y = f(t)$.


15 What are the roots of $s^2 + 2Zs + 1 = 0$? Find two roots for $Z = 0, \frac{1}{2}, 1, 2$
and identify each type of damping. The natural frequency is now $\omega_n = 1$.

16 Find two solutions to $y'' + 2Zy' + y = 0$ for every $Z$ except $Z = 1$ and $-1$. Which
solution $g(t)$ starts from $g(0) = 0$ and $g'(0) = 1$? What is different about $Z = 1$?

17 The equation $my'' + ky = \cos\omega_nt$ is exactly at resonance. The driving frequency
on the right side equals the natural frequency $\omega_n = \sqrt{k/m}$ on the left side.
Substitute $y = Rt\sin(\sqrt{k/m}\,t)$ to find $R$. This resonant solution grows in time
because of the factor $t$.

18 Comparing the equations $Ay'' + By' + Cy = f(t)$ and $4Az'' + Bz' + (C/4)z = f(t)$,
what is the difference in their solutions?

19 Find the fundamental solution to the equation $g'' - 3g' + 2g = \delta(t)$.

20 (Challenge problem) Find the solution to $y'' + By' + y = \cos t$ that starts from
$y(0) = 0$ and $y'(0) = 0$. Then let the damping constant $B$ approach zero, to reach the
resonant equation $y'' + y = \cos t$ in Problem 17, with $m = k = 1$.
Show that your solution $y(t)$ is approaching the resonant solution $\frac{1}{2}t\sin t$.

21 Suppose you know three solutions $y_1, y_2, y_3$ to $y'' + B(t)y' + C(t)y = f(t)$.
How could you find $B(t)$ and $C(t)$ and $f(t)$?
Solution Page Linear Constant Coefficient Equations

First order     $\dfrac{dy}{dt} = ay + f(t)$
Second order    $A\dfrac{d^2y}{dt^2} + B\dfrac{dy}{dt} + Cy = f(t)$
Nth order       $A_N\dfrac{d^Ny}{dt^N} + \cdots + A_1\dfrac{dy}{dt} + A_0y = (A_ND^N + \cdots + A_0)\,y = P(D)y = f(t)$


Null solutions $y_n$ have $f(t) = 0$. Substitute $y = e^{st}$ to find the $N$ exponents $s$.

First order     $\dfrac{d}{dt}(e^{st}) = ae^{st}$     $s = a$ and $y_n = ce^{at}$
Second order    $As^2 + Bs + C = 0$                   $y_n = c_1e^{s_1t} + c_2e^{s_2t}$
Nth order       $P(s) = 0$                            $y_n = c_1e^{s_1t} + \cdots + c_Ne^{s_Nt}$


Exponential response to $f(t) = e^{ct}$. Step response for $c = 0$. Look for $y = Ye^{ct}$.

First order     $\dfrac{d}{dt}(Ye^{ct}) - aYe^{ct} = e^{ct}$     $y_p = \dfrac{e^{ct}}{c - a}$ has $Y = \dfrac{1}{c - a}$
Second order    $Y(Ac^2 + Bc + C)e^{ct} = e^{ct}$               $y_p = \dfrac{e^{ct}}{Ac^2 + Bc + C} = Ye^{ct}$
Nth order       $YP(c)e^{ct} = e^{ct}$                          $y_p = \dfrac{e^{ct}}{P(c)}$ or $\dfrac{te^{ct}}{P'(c)}$


Fundamental solution $g(t)$ = Impulse response when $f(t) = \delta(t)$

First order     $g(t) = e^{at}$ starting from $g(0) = 1$
Second order    $g(t) = \dfrac{e^{s_1t} - e^{s_2t}}{A(s_1 - s_2)}$ starting from $g(0) = 0$ and $g'(0) = 1/A$
Undamped        $g(t) = \dfrac{\sin\omega_nt}{A\omega_n}$
Underdamped     $g(t) = e^{-Z\omega_nt}\,\dfrac{\sin\omega_dt}{A\omega_d}$
Nth order       $g(t) = y_n(t)$ with $g(0) = g'(0) = \cdots = 0$, $g^{(N-1)}(0) = 1/A_N$


Very particular solution for each driving function $f(t)$: zero initial conditions on $y_{vp}$

Multiply input at every time $s$ by the growth factor over $t - s$:     $y_{vp}(t) = \displaystyle\int_0^t g(t - s)f(s)\,ds$


Undetermined coefficients        Direct solution for special $f(t)$ in Section 2.6
Variation of parameters          $y_{vp}(t)$ comes from $y_n(t)$ in Section 2.6
Solution by Laplace transform    Transfer function = transform of $g(t)$ in Section 2.7
Solution by convolution          $y(t) = g(t) * f(t)$ in Section 8.6
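
The integral formula for the very particular solution can be tested on two cases quoted earlier in the chapter: $g(t) = t$ with $f = 1$ should give $t^2/2$, and $g(t) = \sin t$ with $f = 1$ should give $1 - \cos t$. The sketch below approximates the integral with composite Simpson's rule. That numerical method and the function names are choices made here, not part of the text.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def very_particular(g, f, t, n=200):
    """Approximate y_vp(t) = integral from 0 to t of g(t - s) f(s) ds."""
    return simpson(lambda s: g(t - s) * f(s), 0.0, t, n)

def one(s):
    """Constant driving function f = 1."""
    return 1.0

print(very_particular(lambda u: u, one, 3.0), 3.0 ** 2 / 2)     # y'' = 1
print(very_particular(math.sin, one, 2.0), 1 - math.cos(2.0))   # y'' + y = 1
```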
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2.5 Electrical Networks and Mechanical Systems 


Section 2.4 solved the equation Ay” + By’ + Cy = coswt. Now we want to understand the 
meaning of A, B,C in real applications. This is the fundamental equation of engineering for 
a one-unknown system, when the forcing function is a sinusoid. It is a perfect opportunity to 
use the transfer function. This connects the input to the response. 

For mechanical engineers the unknown $y$ gives the position of one mass, which may be
oscillating or rotating or vibrating. For electrical engineers the unknown $y$ is the voltage $V(t)$ or the
current $I(t)$ in a one-loop RLC circuit. Those letters R, L, C represent a resistor, an inductor,
and a capacitor. For a chemical engineer or a scientist or an economist the equation is a
model of ..... I have to stop or this presentation will go out of control.

The great differential equations of applied mathematics are first order or second order. 
The equations we understand best are linear with constant coefficients. 

In later chapters the single unknown becomes a vector. Its coefficients become square 
matrices in $dy/dt = Ay$ and $d^2y/dt^2 = -Sy$. We have a system of $n$ equations
for voltages at nodes or currents along edges or positions of n masses. Linear algebra 
will organize the equations and their solutions. Matrix differential equations give us the 
right language to express applied mathematics. 

Our goals are to find and solve the equations for $y(t)$ in real applications. These are
balance equations: balance of forces and balance of currents. Flow in equals flow out. 


Spring-Mass-Dashpot Equation and Loop Equation 


In mechanics, y and y’ and y” are the position, the velocity, and the acceleration. The 
numbers A, B,C represent the mass m, the damping b, and the stiffness k : 


Newton’s Law F = ma my” + by’ + ky = applied force. (1) 


The picture in Figure 2.12 shows the mass m attached to a spring and also a dashpot. 
Those two are responsible for the forces —ky and —by’. The stretched spring pulls back 
on the mass. By Hooke’s Law that force is —ky. The damping force comes from a dashpot 
(old-fashioned word, key idea). You could visualize the mass moving in a heavy liquid 
like oil. The friction force is —by’, proportional to velocity and in the opposite direction. 


For an electrical network, it was Kirchhoff and not Newton who provided the balance 
equations. Kirchhoff’s Voltage Law says that the sum of voltage drops around any 
closed loop is zero. The current is $I(t)$ and we start with one loop:


Voltage law KVL :   $L\dfrac{dI}{dt} + RI + \dfrac{1}{C}\displaystyle\int I\,dt$ = applied voltage.   (2)


Figure 2.12: Three forces enter $F = my''$: spring force $ky$, friction $by'$, driving force $f$.


The numbers $L$, $R$, $C$ are the inductance, the resistance, and the capacitance. (Unfortunately
we divide by the capacitance $C$. In the end the equation has constant coefficients and
regardless of the letters we solve it.) To produce a second order differential equation for $I(t)$, and
to remove the integration in equation (2), take the derivative of every term:


Loop equation for the current $I(t)$   $LI'' + RI' + \dfrac{1}{C}I = F\cos\omega t$.   (3)


That force $F\cos\omega t$ comes from a battery or a generator, when we close the switch. We will
be looking for a particular solution $I_p(t)$. That solution is produced by the applied force.
We are not looking at initial conditions and $y_n(t)$. Those null solutions $y_n$ are transient, with
$f = 0$. They die out exponentially fast.


Figure 2.13: A one-loop RLC circuit with a source and a switch.
(Labels in the figure: source $f(t)$, capacitance $C$, inductance $L$, resistance $R$, current $I(t)$.)
The Mechanical-Electrical Analogy


Both applications produce second order equations Ay” + By’ + Cy = f(t). This means 
we can solve both problems at once—not only mathematically but also physically. We can 
predict the behavior of a mechanical system by testing an electrical analog, when simple 
circuit elements are more convenient to work with. The basic idea is to match the three 
numbers m, b, k with the numbers L, R, and 1/C. 


Mechanical System                               Electrical System

Mass $m$                                <-->    Inductance $L$
Damping constant $b$                    <-->    Resistance $R$
Spring constant $k$                     <-->    Reciprocal capacitance $1/C$

Natural frequency $\omega_n^2 = k/m$    <-->    Natural frequency $\omega_n^2 = 1/LC$


Before solving for the loop current $I(t)$, let me outline three solution methods: our past
method, our present method, and our future method.


$\cos\omega t$ to $e^{i\omega t}$ to $Y(\omega)$


Past method   Section 2.4 solved $Ay'' + By' + Cy = F\cos\omega t$. The equation was real
and the solution was real. That solution had a sine-cosine form and also an amplitude-phase
form:


$y(t) = M\cos\omega t + N\sin\omega t = G\cos(\omega t - \alpha)$.   (4)


The connections between inputs $F$ and outputs $M, N$ came by substituting $y(t)$ into the
differential equation and matching terms. Then $G^2 = M^2 + N^2$ and $M = G\cos\alpha$.
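
The conversion between the two forms in equation (4) can be sketched numerically. This is a minimal illustration, not part of the original text: the helper name `to_amplitude_phase` and the sample values of $M$, $N$, $\omega$ are assumptions chosen only for the check.

```python
import math

def to_amplitude_phase(M, N):
    """Return (G, alpha) so that M cos(wt) + N sin(wt) = G cos(wt - alpha)."""
    G = math.hypot(M, N)        # G^2 = M^2 + N^2
    alpha = math.atan2(N, M)    # M = G cos(alpha), N = G sin(alpha)
    return G, alpha

# Illustrative values (assumed, not from the text)
M, N, w = 3.0, 4.0, 2.0
G, alpha = to_amplitude_phase(M, N)

# The two forms agree at any sample time
for t in (0.0, 0.3, 1.7):
    sine_cosine = M * math.cos(w * t) + N * math.sin(w * t)
    amplitude_phase = G * math.cos(w * t - alpha)
    assert abs(sine_cosine - amplitude_phase) < 1e-12
```

Using `atan2` rather than a plain arctangent of $N/M$ keeps the angle $\alpha$ in the correct quadrant when $M$ is negative or zero.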


Present method   Instead of working with $\cos\omega t$ and $\sin\omega t$, it is much cleaner to work with
a complex input $Ve^{i\omega t}$. Then the output (the current) is a multiple of $Ve^{i\omega t}$.
That multiple $Y$ is a complex number. It tells us amplitudes and also phase shifts.


This is the right way to see the response of a one-loop RLC circuit. When the input
frequency is $\omega$, the output frequency is also $\omega$.

Equation   $L\dfrac{dI}{dt} + RI + \dfrac{1}{C}\displaystyle\int I\,dt$ = applied voltage $= Ve^{i\omega t}$   (5)


Solution   $I(t) = \dfrac{Ve^{i\omega t}}{i\omega L + R + 1/i\omega C} = \dfrac{\text{input}}{\text{impedance}}$   (6)


We will study that complex impedance in detail. 


Future method   Once we see the advantages of a complex $e^{i\omega t}$, we won't go back.
What we are really doing is to change a differential equation for $y$ in the time domain
into an algebraic equation for $Y$ in the frequency domain :


Set $y = Ye^{i\omega t}$   $Ay'' + By' + Cy = e^{i\omega t}$ becomes $(i^2\omega^2A + i\omega B + C)\,Y = 1$.

Derivatives of $y(t)$ become multiplications by $i\omega$. We are talking here about the most
important and useful simplification in applied mathematics. It requires constant coefficients
$A, B, C$. This allows us to factor out $e^{i\omega t}$.
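
To see the algebraic equation at work, here is a small numerical sketch. It computes $Y$ from $(i^2\omega^2A + i\omega B + C)Y = 1$ and then checks, with finite differences, that the real part of $Ye^{i\omega t}$ satisfies the real equation with forcing $\cos\omega t$. The coefficient values, the step size `h`, and the function names are illustrative assumptions.

```python
import cmath
import math

# Illustrative coefficients and driving frequency (assumed values)
A, B, C, w = 1.0, 0.5, 4.0, 1.5

# Algebra in the frequency domain: (i^2 w^2 A + i w B + C) Y = 1
Y = 1.0 / ((1j * w) ** 2 * A + 1j * w * B + C)

def y_real(t):
    """Real part of Y e^{iwt}: the response to the real input cos(wt)."""
    return (Y * cmath.exp(1j * w * t)).real

def ode_residual(t, h=1e-3):
    """A y'' + B y' + C y - cos(wt), with derivatives from central differences."""
    d1 = (y_real(t + h) - y_real(t - h)) / (2 * h)
    d2 = (y_real(t + h) - 2 * y_real(t) + y_real(t - h)) / h ** 2
    return A * d2 + B * d1 + C * y_real(t) - math.cos(w * t)

gain = abs(Y)                  # amplitude of the output for unit input
phase_lag = -cmath.phase(Y)    # output lags the input by this angle

for t in (0.0, 0.7, 2.0):
    assert abs(ode_residual(t)) < 1e-5
```

The residual is not exactly zero only because the derivatives are approximated; the algebraic $Y$ itself is exact.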

The transfer function $Y(s)$ takes two more steps from derivatives to algebra. First,
it changes $e^{i\omega t}$ to $e^{st}$. That exponent $s$ can be pure imaginary ($s = i\omega$). It can also be
any complex number ($s = a + i\omega$). We recover the freedom of Chapter 1, to allow
growth or decay from $a > 0$ or $a < 0$. We are interested in all $s$ and not just the
special $s_1$ and $s_2$ that came from solving $As^2 + Bs + C = 0$.

The exponentials $e^{s_1t}$ and $e^{s_2t}$ went into the transient solution $y_n(t)$. Now we are
working with the long-time solution $y_p(t)$ coming from an applied force $Fe^{st}$.

The second contribution of the transfer function is to give a name to the all-important 
multiplier in the system. It multiplies the input to give the output. 


The transfer function is $Y(s) = \dfrac{1}{As^2 + Bs + C}$.   The output is $Y(s)$ times $e^{st}$.


Derivatives and integrals become multiplications and divisions (by $s$). One more name is
needed. $Y(s)$ is the Laplace transform of the impulse response $g(t)$.


Input $f = \delta(t)$   Output $y = g(t)$ = impulse response   Transform $Y(s)$


Input $f$ = step   Output $y = r(t)$ = step response   Transform $Y(s)/s$


The step function is the integral of the impulse $\delta(t)$. The step response is the integral
of the impulse response $g(t)$. For their Laplace transforms, integration becomes division
by $s$. Calculus in the time domain becomes algebra in the frequency domain.

The rules for the transforms of $dy/dt$ and $\int y(t)\,dt$, and also a table
of inverse Laplace transforms to recover $y(t)$ from $Y(s)$, will come in Section 2.7.


Complex Impedance 


The present method uses $Ve^{i\omega t}$ for the alternating current input. The output divides that
input by the impedance $Z$. This is like Ohm's Law $I = E/R$, but the resistance $R$
changes to the impedance $Z$ for this RLC loop:


Current   $I(t) = \dfrac{Ve^{i\omega t}}{i\omega L + R + 1/i\omega C} = \dfrac{Ve^{i\omega t}}{Z} = \dfrac{\text{input}}{\text{impedance}}$   (7)


The complex impedance $Z$ depends on $\omega$. The real part of $Z$ is the resistance $R$.

The imaginary part of $Z$ is the "reactance" $\omega L - 1/\omega C$. From those rectangular
coordinates Re $Z$ and Im $Z$, we know the polar form $|Z|e^{i\alpha}$ of this complex number :


Magnitude   $|Z| = \sqrt{R^2 + (\omega L - 1/\omega C)^2}$   (8)

Phase angle   $\tan\alpha = \dfrac{\operatorname{Im} Z}{\operatorname{Re} Z} = \dfrac{\omega L - 1/\omega C}{R}$   (9)

Loop current   $I(t) = \dfrac{V}{Z}\,e^{i\omega t} = \dfrac{V}{|Z|}\,e^{i(\omega t - \alpha)}$   (10)
The phase angle $\alpha$ tells us the time lag of the current behind the voltage.

Remember that $R$ is the damping constant, like the coefficient $B$ in $Ay'' + By' + Cy$.
In the language of Section 2.4, we have forced damped motion. The damping keeps us
away from exact resonance with the natural frequency of free undamped motion, which
has $\omega L = 1/\omega C$ and $\omega = 1/\sqrt{LC}$. The magnitude $|Z|$ is smallest and $V/|Z|$
is largest at that natural frequency. We tune a radio to this $\omega$ to get a loud clear signal.
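
A short sketch of equations (8) and (9) in code may help. The component values below are assumptions picked so that the natural frequency comes out to a round number; they are not the values of the examples in the text, and the function name `impedance` is hypothetical.

```python
import cmath
import math

def impedance(R, L, C, w):
    """Complex impedance Z = R + i(wL - 1/(wC)) of a one-loop RLC circuit."""
    return R + 1j * (w * L - 1.0 / (w * C))

# Assumed illustrative values: ohms, henrys, farads
R, L, C = 10.0, 0.1, 1e-3
w_n = 1.0 / math.sqrt(L * C)      # natural frequency, about 100 rad/sec here

for w in (0.5 * w_n, w_n, 2.0 * w_n):
    Z = impedance(R, L, C, w)
    magnitude = abs(Z)            # equation (8)
    alpha = cmath.phase(Z)        # equation (9): tan(alpha) = Im Z / Re Z
    print(f"w = {w:8.2f}   |Z| = {magnitude:8.3f}   alpha = {alpha:+.4f} rad")
```

At $\omega = \omega_n$ the reactance cancels and $Z$ reduces to $R$. Below that frequency the capacitor dominates and $\alpha$ is negative; above it the inductor dominates and $\alpha$ is positive.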


Example 1 Suppose the RLC circuit has resistance R = 10 ohms and inductance L = 0.1 
henry and capacitance $C = 10^{-4}$ farad. The units of $R$ and $\omega L$ and $1/\omega C$ must agree. Since
frequency w is measured in inverse seconds, all three units can be given in terms of V = 
volts and A = amps (for current) and seconds: 


$R$   Ohm $\Omega$ = V/A = 1 volt per amp
$L$   Henry H = V·sec/A = 1 volt-second per amp
$C$   Farad F = A·sec/V = 1 amp-second per volt


Example 2   Find the impedance $Z$, its magnitude $|Z|$, and the phase angle $\alpha$ for an RLC
loop when the frequency is $\omega$ = 60 cycles/second = 60 Hz = $120\pi$ radians/second.

The impedance of this loop is   $Z = R + i\left(\omega L - \dfrac{1}{\omega C}\right) = |Z|e^{i\alpha}$.

The magnitude of the impedance is   $|Z| = \sqrt{R^2 + (\omega L - 1/\omega C)^2}$.

The phase angle producing time delay is   $\alpha = \ldots$


Example 3   To tune a radio to a station with frequency $\omega$, what should be the
capacitance $C$ (which you adjust) ? Suppose $R$ and $L$ are fixed and known.


Solution   The goal of tuning is to achieve $\omega L = 1/\omega C$. Then the imaginary part of $Z$
is zero: inductance cancels capacitance. Tuning achieves $Z = R$, that real part $R$ is fixed.


Example 4 Suppose the network contains two RLC branches in parallel. Find the 
total impedance $Z_{12}$ from the impedances $Z_1$ and $Z_2$ of the two separate branches.


$\dfrac{1}{Z_{12}} = \dfrac{1}{Z_1} + \dfrac{1}{Z_2}$
Loop Equations Versus Node Equations : KVL or KCL


Equation (2) expressed Kirchhoff’s Voltage Law. The sum of voltage drops around a 
closed loop is zero. In principle, we could find a set of independent loops in any larger 
electrical network. Then the Voltage Law will give an equation like (2) around each of 
the independent loops. Those loop currents determine the currents on all the edges of the 
network and the voltages at all the nodes. 

Most codes to solve problems on large networks do not use the voltage law! The 
preferred approach is Kirchhoff’s Current Law: The net current into each node is zero. 
The balance equations of KCL say that “current in = current out” at every node. 

Let me illustrate nodal analysis using the network in Figure 2.14. The unknowns 
are the voltages $V_1$ and $V_2$. The currents are easy to find once those voltages are known.


Figure 2.14: Four currents in and out of Node 1. Node 2: Current in, current out.
(Labels in the figure: node voltages $V_1$ and $V_2$, a current source, grounded nodes at voltage 0.)


A problem of this size can be solved symbolically or numerically : 


Symbolically   Work in the $s$-domain and find the transfer function. Since $R_1$ is in
parallel with $L$, and $R_2$ is in series with $C$, we can find the currents on all the
edges in terms of $V_1$ and $V_2$. Here is Kirchhoff's Current Law at those nodes:


$\dfrac{V_1}{R_1} + \dfrac{V_1}{Ls} + \dfrac{V_1 - V_2}{R_2} = I$   and   $\dfrac{V_2 - V_1}{R_2} + sCV_2 = 0$   (11)


Numerically   Assign values to $R_1, L, R_2, C$ and $\omega$. Compute $V_1$ and $V_2$ from
current balance at the nodes. Compute the currents from $V_1/R_1$ and $V_1/iL\omega$.
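
The numerical route can be sketched for this two-node network. The sketch assumes the topology described above ($R_1$ and $L$ from node 1 to ground, $R_2$ between the nodes, $C$ from node 2 to ground) with a current source of amplitude $I$ feeding node 1; the component values and the function name `solve_nodes` are illustrative assumptions.

```python
def solve_nodes(R1, L, R2, C, w, I=1.0):
    """Solve the two KCL equations for complex node voltages V1, V2 at s = i*w."""
    s = 1j * w
    # Node 1: V1/R1 + V1/(sL) + (V1 - V2)/R2 = I
    # Node 2: (V2 - V1)/R2 + s*C*V2 = 0
    a11 = 1 / R1 + 1 / (s * L) + 1 / R2
    a12 = -1 / R2
    a21 = -1 / R2
    a22 = 1 / R2 + s * C
    det = a11 * a22 - a12 * a21
    V1 = I * a22 / det          # Cramer's rule, right side (I, 0)
    V2 = -a21 * I / det
    return V1, V2

# Assumed illustrative values: ohms, henrys, farads, rad/sec, amps
R1, L, R2, C, w, I = 100.0, 0.1, 50.0, 1e-5, 1000.0, 1.0
V1, V2 = solve_nodes(R1, L, R2, C, w, I)

s = 1j * w
current_R1 = V1 / R1            # edge currents follow from the node voltages
current_L = V1 / (s * L)
current_R2 = (V1 - V2) / R2
current_C = s * C * V2
```

For a network with more nodes the same balance would be written as a matrix equation and handed to a linear solver; Cramer's rule is used here only because the system is 2 by 2.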


For a larger network, the algebra in the $s$-domain ($i\omega$ domain) becomes humanly
impossible. A symbolic package could go further but in the end (and for nonlinear networks) the
numerical approach will win. Widely known codes developed from the original SPICE code
created at UC Berkeley. The SPICE codes use nodal analysis instead of loop analysis, for
realistic networks.

Computational mechanics faced the same choice between nodal analysis and loop 
analysis. It reached the same conclusion. A complicated structure is broken up into 
finite elements—small pieces in which linear or quadratic approximation is adequate. 




The choice is between displacements at nodes or stresses inside the elements, as the
primary unknowns. The finite element community has made the same decision as the circuit
simulation community: Work with displacements (and work with voltages) at the nodes.


A network produces a large system of equations—linear equations with simple RLC 
elements and nonlinear equations for circuit elements like transistors. The nodes connected 
by the edges form a graph. To organize the equations, you need the basic concepts of graph 
theory in Section 5.5: 


An incidence matrix $A$ tells which pairs of nodes are connected by which edges.


A conductivity matrix $C$ expresses the physical properties along each edge.


Then the overall conductance matrix is $K = A^{\mathrm T}CA$. The system we solve, for linear
problems in circuit simulation and in structural mechanics, has the matrix form $Ky = f$.
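
A tiny example shows how $K = A^{\mathrm T}CA$ is assembled. The graph (three nodes, three edges), the sign convention ($-1$ at the start node and $+1$ at the end node of each edge), the diagonal form of $C$, and the conductance values are all assumptions made for this sketch.

```python
def transpose(X):
    return [list(row) for row in zip(*X)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

# Incidence matrix: one row per edge, one column per node.
# Edges (assumed): node 1 -> node 2, node 1 -> node 3, node 2 -> node 3.
A = [[-1.0, 1.0, 0.0],
     [-1.0, 0.0, 1.0],
     [0.0, -1.0, 1.0]]

# Diagonal matrix of edge conductances (illustrative values)
c = [1.0, 2.0, 3.0]
Cmat = [[c[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

K = matmul(matmul(transpose(A), Cmat), A)
for row in K:
    print(row)
```

Each row of this $K$ adds to zero, because adding the same constant to every node voltage produces no current. Fixing (grounding) one node removes that row and column and leaves an invertible matrix.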


Chapter 4 will explain matrices and Section 5.5 will focus on the incidence matrix A 
of a graph. Those are necessary preparations for Kirchhoff’s Current Law at all the nodes. 
Then Sections 7.4 and 7.5 create the stiffness matrix K (for mechanics) and the graph 
Laplacian matrix (for networks) : basic ideas in applied mathematics. 


Step Response 


This book has emphasized the two fundamental problems for differential equations. 
One is the response to a delta function. The other is the response to a step function. 
For second order equations the impulse response g(t) was computed in Section 2.3. 
This is our chance to find the step response, and we have to take it. 


The two responses are closely related because the two inputs are related. The delta 
function is the derivative of the step function H(t). The step function is the integral of the 
delta function. For constant coefficient equations, we can integrate every term. The integral 
of the impulse response g(t) is the step response r(t). 


Impulse response $g(t)$   $Ag'' + Bg' + Cg = C\delta(t)$   (12)


Step response $r(t)$   $Ar'' + Br' + Cr = CH(t)$   (13)


We are following the "better notation" convention that includes the coefficient $C$ on the
right hand side. Its purpose is to give the output $y$ or $g$ or $r$ the same units as the forcing
term. Then the gain $G$ = |output/input| is dimensionless. For the step function with
input $H(t) = 1$, the steady state of the step response will be $r(\infty) = 1$.


I see two ways to compute that step response. One is to integrate the impulse response.
The other is to solve equation (13) directly. The particular solution is $r_p(t) = 1$. The
null solution is a combination of $e^{s_1t}$ and $e^{s_2t}$, using the two roots of $As^2 + Bs + C = 0$.
To be safe, it seems reasonable to find $r(t)$ both ways.


Method 1   Integrate the impulse response   $g(t) = \dfrac{C}{A}\,\dfrac{e^{s_1t} - e^{s_2t}}{s_1 - s_2}$   (14)


Method 2   Solve $Ar'' + Br' + Cr = C$ with $r(0) = r'(0) = 0$.   (15)


Computing the Step Response 


Method 2 is the normal way to solve differential equations. Substitute $e^{st}$ to find $s$:

Null solutions $e^{st}$   $As^2 + Bs + C = 0$ has roots $s_1$ and $s_2$.

The complete solution to $Ar'' + Br' + Cr = C$ is particular + null:

$r(t) = 1 + c_1e^{s_1t} + c_2e^{s_2t}$.   (16)


The step response starts from $r(0) = 0$ and $r'(0) = 0$. A switch is turned on at $t = 0$,
and the solution rises to $r(\infty) = 1$. The conditions at $t = 0$ determine $c_1$ and $c_2$:


$r(0) = 1 + c_1 + c_2 = 0$   $r'(0) = c_1s_1 + c_2s_2 = 0$.   (17)


Those coefficients are $c_1 = s_2/(s_1 - s_2)$ and $c_2 = -s_1/(s_1 - s_2)$. Then we know $r(t)$:


Step response   $r(t) = 1 + \dfrac{1}{s_1 - s_2}\left(s_2e^{s_1t} - s_1e^{s_2t}\right)$.   (18)


The same answer must come from integrating $g(t)$ in equation (14) from 0 to $t$.
Remember that the roots of any quadratic multiply to give $s_1s_2 = C/A$.


Step response = integral of $g(t)$   $r(t) = \dfrac{s_1s_2}{s_1 - s_2}\left[\dfrac{e^{s_1t}}{s_1} - \dfrac{e^{s_2t}}{s_2}\right]_0^t$.   (19)


The coefficient of $e^{s_1t}$ is the same $s_2/(s_1 - s_2)$ as in (18). Similarly for the coefficient of
$e^{s_2t}$. The constant term equals 1, so (18) and (19) are the same:


$\dfrac{s_1s_2}{s_1 - s_2}\left[-\dfrac{1}{s_1} + \dfrac{1}{s_2}\right] = \dfrac{s_1s_2}{s_1 - s_2}\cdot\dfrac{s_1 - s_2}{s_1s_2} = 1$.
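
The agreement between the two methods can also be checked numerically. This sketch uses an assumed overdamped example ($A, B, C = 1, 3, 2$ with real roots $-1$ and $-2$), evaluates the closed form for $r(t)$ built from the roots, and compares it with a trapezoidal integral of the impulse response. The function names and the number of integration steps are illustrative choices.

```python
import math

# Assumed overdamped example: s^2 + 3s + 2 = 0 has roots -1 and -2
A, B, C = 1.0, 3.0, 2.0
root = math.sqrt(B * B - 4 * A * C)
s1 = (-B + root) / (2 * A)
s2 = (-B - root) / (2 * A)

def g(t):
    """Impulse response with C*delta(t) on the right side."""
    return C * (math.exp(s1 * t) - math.exp(s2 * t)) / (A * (s1 - s2))

def r_formula(t):
    """Step response 1 + (s2 e^{s1 t} - s1 e^{s2 t}) / (s1 - s2)."""
    return 1.0 + (s2 * math.exp(s1 * t) - s1 * math.exp(s2 * t)) / (s1 - s2)

def r_by_integration(t, n=2000):
    """Trapezoidal rule for the integral of g from 0 to t."""
    h = t / n
    total = 0.5 * (g(0.0) + g(t))
    for k in range(1, n):
        total += g(k * h)
    return total * h

for t in (0.5, 1.0, 3.0):
    print(t, r_formula(t), r_by_integration(t))
```

The two columns agree to within the accuracy of the trapezoidal rule, and both approach the steady state 1 as $t$ grows.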


Better Notation 


Our formula for the step response $r(t)$ can't stop with equation (18). Those roots $s_1$ and $s_2$
will depend on the physical parameters $A, B, C$. In mechanics these numbers are $m, b, k$.
For a one-loop network the numbers are $L, R, 1/C$. We need to express $r(t)$ with numbers
we know, instead of $s_1$ and $s_2$.
Remember that combinations of $A, B, C$ are especially useful. The simplest choices are
$p = B/2A$ and $\omega_n^2 = C/A$:


$r'' + \dfrac{B}{A}r' + \dfrac{C}{A}r = \dfrac{C}{A}$   becomes   $r'' + 2pr' + \omega_n^2r = \omega_n^2$.   (20)


The same exponents $s_1$ and $s_2$ are now roots of $s^2 + 2ps + \omega_n^2 = 0$. Suppose $p < \omega_n$:

Null solutions $e^{st}$   $s_1, s_2 = -p \pm \sqrt{p^2 - \omega_n^2} = -p \pm i\omega_d$.   (21)


Substituting for $s_1$ and $s_2$ in equation (18) gives a beautiful expression for $r(t)$ :


Step response   $r(t) = 1 - \dfrac{\omega_n}{\omega_d}\,e^{-pt}\sin(\omega_dt + \phi)$.   (22)


That angle $\phi$ is in the right triangle that connects $\omega_n$ to $p$ and $\omega_d$:


$\omega_d^2 + p^2 = \omega_n^2$   $\sin\phi = \dfrac{\omega_d}{\omega_n}$   $\cos\phi = \dfrac{p}{\omega_n}$   (right triangle with hypotenuse $\omega_n$ and sides $p$ and $\omega_d$)


Now we check that $r(0) = 0$ and $r'(0) = 0$. Then formula (22) must be correct:


$r(0) = 1 - \dfrac{\omega_n}{\omega_d}\sin\phi = 0$   $r'(0) = \dfrac{\omega_n}{\omega_d}(p\sin\phi - \omega_d\cos\phi) = 0$.


That final solution (22) combines $e^{-pt}\sin\omega_dt$ and $e^{-pt}\cos\omega_dt$. This null solution is a
combination of $e^{s_1t}$ and $e^{s_2t}$ with $s = -p \pm i\omega_d$, as required. The particular solution is
$r(\infty) = 1$. We see this steady state appear when the transients decay to zero with $e^{-pt}$.
The step response rises to 1.
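
As a check on the algebra, the amplitude-phase formula for the underdamped step response can be compared with the root formula using complex exponentials. The values of $p$ and $\omega_n$ below are assumptions (any pair with $p < \omega_n$ would do), and the function names are hypothetical.

```python
import cmath
import math

# Assumed underdamped example with p < w_n
p, w_n = 0.5, 2.0
w_d = math.sqrt(w_n ** 2 - p ** 2)     # damped frequency
phi = math.atan2(w_d, p)               # sin(phi) = w_d/w_n, cos(phi) = p/w_n

def r_amplitude_phase(t):
    """r(t) = 1 - (w_n/w_d) e^{-pt} sin(w_d t + phi)."""
    return 1.0 - (w_n / w_d) * math.exp(-p * t) * math.sin(w_d * t + phi)

s1 = complex(-p, w_d)
s2 = complex(-p, -w_d)

def r_from_roots(t):
    """r(t) = 1 + (s2 e^{s1 t} - s1 e^{s2 t}) / (s1 - s2), which is real."""
    value = 1.0 + (s2 * cmath.exp(s1 * t) - s1 * cmath.exp(s2 * t)) / (s1 - s2)
    return value.real

for t in (0.0, 0.4, 2.5, 10.0):
    assert abs(r_amplitude_phase(t) - r_from_roots(t)) < 1e-12
```

The imaginary parts cancel because $s_2e^{s_1t}$ and $s_1e^{s_2t}$ are complex conjugates, so their difference divided by $s_1 - s_2 = 2i\omega_d$ is real.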

The number $p = B/2A$ can be replaced by $\omega_n$ times the damping ratio, if preferred.


Practical Resonance: Minimum D, Maximum Gain 


The gain is $1/\sqrt{D}$. If $D$ is small then the gain is large. That is how you tune a radio,
by choosing the frequency $\omega_{\text{res}}$ that minimizes $D$ and maximizes $G$. Then you can hear
the signal. It is not perfect resonance (the gain does not become infinite) but it is
resonance in practice.


Practical resonance   Minimize $D = (C - A\omega^2)^2 + (B\omega)^2$


Derivative of $D$ is zero   $-4A\omega(C - A\omega^2) + 2B^2\omega = 0$.


When you cancel $\omega$ and solve $2B^2 = 4A(C - A\omega^2)$, that gives the frequency $\omega_{\text{res}}$ with
largest gain. When $B = 0$ this is the natural frequency $\omega_n$ with infinite gain: $A\omega^2 = C$.
For $2Z^2 < 1$ there is practical resonance when $2B^2 = 4A(C - A\omega^2)$ at $\omega_{\text{res}}$:


Largest gain   $\omega_{\text{res}}^2 = \dfrac{C}{A} - \dfrac{B^2}{2A^2} = \dfrac{C}{A}\left(1 - \dfrac{B^2}{2AC}\right) = \omega_n^2(1 - 2Z^2)$.

REVIEW OF THE KEY IDEAS


1. $L, R, C$ in $LI'' + RI' + \frac{1}{C}I = e^{i\omega t}$ are the inductance, resistance, capacitance.
2. For networks, node equations replace that loop equation: KCL instead of KVL.
3. The response to a step function rises from $r(0) = 0$ to a steady value $r(\infty) = 1$.
4. Practical resonance (the maximum gain) is at the frequency $\omega_{\text{res}} = \omega_n\sqrt{1 - 2Z^2}$.


Important note   We computed the step response $r(t)$ in the time domain. Using the Laplace
transform in Section 2.7, this computation can be moved to the $s$-domain. The
transform of a unit step is $1/s$. Derivatives in $t$ become multiplications by $s$ :


The state equation $Ar'' + Br' + Cr = C$ transforms to $(As^2 + Bs + C)\,R(s) = \dfrac{C}{s}$.
The problem is to find the inverse Laplace transform r(t) of this function R(s). There are 
excellent control engineering textbooks that leave this as an exercise in partial fractions. 
The time domain (state space) solution in this section reached r(t) successfully. 


Problem Set 2.5 


1   (Resistors in parallel) Two parallel resistors $R_1$ and $R_2$ connect a node at voltage $V$
to a node at voltage zero. The currents are $V/R_1$ and $V/R_2$. What is the total current
$I$ between the nodes? Writing $R_{12}$ for the ratio $V/I$, what is $R_{12}$ in terms of $R_1$ and
$R_2$?


2   (Inductor and capacitor in parallel) Those elements connect a node at voltage $Ve^{i\omega t}$ to
a node at voltage zero (grounded node). The currents are $(V/i\omega L)e^{i\omega t}$ and
$V(i\omega C)e^{i\omega t}$. The total current $Ie^{i\omega t}$ between the nodes is their sum. Writing
$Z_{12}$ for the ratio $Ve^{i\omega t}/Ie^{i\omega t}$, what is $Z_{12}$ in terms of $i\omega L$ and $i\omega C$?


3   The impedance of an RLC loop is $Z = i\omega L + R + 1/i\omega C$. This impedance $Z$ is real
when $\omega$ = ____. This impedance is pure imaginary when ____. This impedance
is zero when ____.


4 What is the impedance Z of an RLC loop when R = L = C = 1? Draw a graph that 
shows the magnitude |Z| as a function of w. 
5   Why does an LC loop with no resistor produce a 90° phase shift between current
and voltage ? Current goes around the loop from a battery of voltage $V$ in the loop.


6   The mechanical equivalent of zero resistance is zero damping: $my'' + ky = \cos\omega t$.
Find $c_1$ and $Y$ starting from $y(0) = 0$ and $y'(0) = 0$ with $\omega_n^2 = k/m$.


$y(t) = c_1\cos\omega_nt + Y\cos\omega t$.


That answer can be written in two equivalent ways:

$y = Y(\cos\omega t - \cos\omega_nt) = 2Y\sin\dfrac{(\omega_n - \omega)t}{2}\,\sin\dfrac{(\omega_n + \omega)t}{2}$.


7   Suppose the driving frequency $\omega$ is close to $\omega_n$ in Problem 6. A fast oscillation
$\sin[(\omega_n + \omega)t/2]$ is multiplying a very slow oscillation $2Y\sin[(\omega_n - \omega)t/2]$.
By hand or by computer, draw the graph of $y = (\sin t)(\sin 9t)$ from 0 to $2\pi$.


You should see a fast sine curve inside a slow sine curve. This is a beat.


8   What $m, b, k, F$ equation for a mass-dashpot-spring-force corresponds to Kirchhoff's
Voltage Law around a loop? What force balance equation on a mass corresponds to
Kirchhoff's Current Law ?


9   If you only know the natural frequency $\omega_n$ and the damping coefficient $b$ for one
mass and one spring, why is that not enough to find the damped frequency $\omega_d$?
If you know all of $m, b, k$ what is $\omega_d$?


10   Varying the number $a$ in a first order equation $y' - ay = 1$ changes the speed of the
response. Varying $B$ and $C$ in a second order equation $y'' + By' + Cy = 1$ changes
the form of the response. Explain the difference.


11   Find the step response $r(t) = y_p + y_n$ for this overdamped system :
$r'' + 2.5r' + r = 1$ with $r(0) = 0$ and $r'(0) = 0$.


12   Find the step response $r(t) = y_p + y_n$ for this critically damped system. The double
root $s = -1$ produces what form for the null solution ?


$r'' + 2r' + r = 1$ with $r(0) = 0$ and $r'(0) = 0$.

13   Find the step response $r(t)$ for this underdamped system using equation (22) :
$r'' + r' + r = 1$ with $r(0) = 0$ and $r'(0) = 0$.

14   Find the step response $r(t)$ for this undamped system and compare with (22):


$r'' + r = 1$ with $r(0) = 0$ and $r'(0) = 0$.


15   For $b^2 < 4mk$ (underdamping), what parameter decides the speed at which the step
response $r(t)$ rises to $r(\infty) = 1$? Show that the peak time is $T = \pi/\omega_d$ when
$r(t)$ reaches its maximum before settling back to $r = 1$. At peak time $r'(T) = 0$.
16   If the voltage source $V(t)$ in an RLC loop is a unit step function, what resistance $R$
will produce an overshoot to $r_{\max} = 1.2$ if $C = 10^{-6}$ Farads and $L = 1$ Henry ?
(Problem 15 found the peak time $T$ when $r(T) = r_{\max}$.)


Sketch two graphs of $r(t)$ for $p_1 < p_2$. Sketch two graphs as $\omega_d$ increases.

17   What values of $m, b, k$ will give the step response $r(t) = 1 - \sqrt{2}\,e^{-t}\sin(t + \frac{\pi}{4})$?


18   What happens to the $p$, $\omega_d$, $\omega_n$ right triangle as the damping ratio $p/\omega_n$ increases
to 1 (critical damping)? At that point the damped frequency $\omega_d$ becomes ____. The
step response becomes $r(t)$ = ____.


19   The roots $s_1, s_2 = -p \pm i\omega_d$ are poles of the transfer function $1/(As^2 + Bs + C)$.


Show directly that the product of the roots $s_1 = -p + i\omega_d$ and $s_2 = -p - i\omega_d$ is
$s_1s_2 = \omega_n^2$. The sum of the roots is $-2p$. The quadratic equation with those roots
is $s^2 + 2ps + \omega_n^2 = 0$.


(Figure: the roots $s_1$ and $s_2$ in the complex plane, at distance $\omega_n$ from the origin. Axes: real axis and imaginary axis.)


20   Suppose $p$ is increased while $\omega_n$ is held constant. How do the roots $s_1$ and $s_2$ move?


21   Suppose the mass $m$ is increased while the coefficients $b$ and $k$ are unchanged. What
happens to the roots $s_1$ and $s_2$?


22   Ramp response   How could you find $y(t)$ when $F = t$ is a ramp function?
$y'' + 2py' + \omega_n^2y = \omega_n^2t$ starting from $y(0) = 0$ and $y'(0) = 0$.


A particular solution (straight line) is $y_p$ = ____. The null solution still has the
form $y_n$ = ____. Find the coefficients $c_1$ and $c_2$ in the null solution from the two
conditions at $t = 0$.


This ramp response $y(t)$ can also be seen as the integral of ____.
2.6 Solutions to Second Order Equations


Up to now, all forcing terms $f(t)$ for second order equations have been $e^{st}$ or $\cos\omega t$.
How can you find a particular solution when $f(t)$ is not a sinusoid or exponential? This
section gives one answer for constant coefficients $A, B, C$ and then a general answer VP:


UC   If $f(t)$ is a polynomial in $t$, then $y_p(t)$ is also a polynomial in $t$.
VP   Suppose we know the null solutions $y_n = c_1y_1(t) + c_2y_2(t)$. Then
a particular solution has the form $y_p = c_1(t)y_1(t) + c_2(t)y_2(t)$.


Those methods are called "undetermined coefficients" and "variation of parameters".


The special method is simple to execute (you will like it). When $f(t)$ is a quadratic,
then one solution is also a quadratic: $y_p(t) = at^2 + bt + c$. Those numbers $a, b, c$ are the
undetermined coefficients. The differential equation will determine them. This succeeds
for any constant coefficient differential equation, always limited to special $f(t)$.

That method UC can be pushed further. If $f(t)$ is a polynomial times an exponential,
then $y_p(t)$ has the same form. The highest power of $t$ allowed in $y_p$ is the same as in $f$.
Those polynomials normally have the same degree.

Only in the case of resonance must we allow an extra factor $t$ in the solution. This is like
the exponential response to $f(t) = e^{st}$ in Section 2.4. That presented a perfect example of
an undetermined coefficient $Y$ in $y_p(t) = Ye^{st}$. The coefficient $Y = 1/(As^2 + Bs + C)$
was determined by the equation. This is $Y = 1/P(s)$ for all equations $P(D)y = e^{st}$.
With resonance we move to $y_p = te^{st}/P'(s)$.

Variation of parameters is a more powerful method. It applies to all $f(t)$. It even
applies when the equation $A(t)y'' + B(t)y' + C(t)y = f(t)$ has variable coefficients. But
it starts with a big assumption: We have to know the null solutions $y_1(t)$ and $y_2(t)$.

The method will succeed completely when the coefficients $A, B, C$ are constant. This
important case gives formula (17). Variation of parameters also succeeded in Chapter 1,
for first order equations $y' - a(t)y = q(t)$. In that case we could solve the null equation
$y' = a(t)y$. For second order equations with variable coefficients, like Airy's equation
$y'' = ty$, the null equation is a difficult obstacle.

I guess we have to realize that not all problems lead to simple formulas. 


The Method of Undetermined Coefficients 


This direct approach finds a particular solution $y_p$ when the forcing term $f(t)$ has a
special form. I can explain the method of undetermined coefficients by four examples.


Example 1   $y'' + y = t^2$ has a solution of the form $y = at^2 + bt + c$.


The reason for this choice of $y$ is that $y'$ and $y''$ will have a similar form. They will also be
combinations of $t^2$ and $t$ and 1. All the terms in $y'' + y = t^2$ will have this special form.
Choose the numbers $a, b, c$ to satisfy that equation :

$y'' + y = (at^2 + bt + c)'' + (at^2 + bt + c) = t^2$.   (1)

Key idea: We can separately match the coefficients of $t^2$ and $t$ and 1 in equation (1):

($t^2$)  $a = 1$      ($t$)  $b = 0$      (1)  $2a + c = 0$   (2)

Then $c = -2a = -2$ and the answer is $y = at^2 + c = t^2 - 2$. This solves $y'' + y = t^2$.


Example 2   Find the complete solution to $y'' + 4y' + 3y = e^{-t} + t$.

Answer   First find the null solution to $y_n'' + 4y_n' + 3y_n = 0$, by substituting $y_n = e^{st}$:


$(s^2 + 4s + 3)\,e^{st} = 0$ leads to $s^2 + 4s + 3 = (s + 1)(s + 3) = 0$.

The roots are $s_1 = -1$ and $s_2 = -3$. The null solutions are $y_n = c_1e^{-t} + c_2e^{-3t}$.
Now find one particular solution. With $f = e^{-t} + t$, the usual form with undetermined
coefficients would be $y_p = ae^{-t} + bt + c$ (notice $c$ in the polynomial). But $e^{-t}$ is a
null solution. Therefore the assumed form for $y$ needs an extra factor $t$ multiplying $e^{-t}$.


Substitute $y = ate^{-t} + bt + c$ into the differential equation, so $y' = ae^{-t} - ate^{-t} + b$:

$y'' + 4y' + 3y = (-2ae^{-t} + ate^{-t}) + 4(ae^{-t} - ate^{-t} + b) + 3(ate^{-t} + bt + c) = e^{-t} + t$.


The coefficients of $te^{-t}$ are $a - 4a + 3a = 0$. No problem with this $te^{-t}$ term. We must
balance the coefficients of $e^{-t}$ and $t$ and 1:


Find $a, b, c$      $-2a + 4a = 1$      $3b = 1$      $4b + 3c = 0$

Then $a = \frac{1}{2}$ and $b = \frac{1}{3}$ and $c = -\frac{4}{9}$ produce the particular $y_p = \frac{1}{2}te^{-t} + \frac{1}{3}t - \frac{4}{9}$.
The null solution is $c_1e^{-t} + c_2e^{-3t}$. The complete solution is always $y = y_p + y_n$.
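
A particular solution is easy to verify by substituting it back. The sketch below does that for the values $a = 1/2$, $b = 1/3$, $c = -4/9$ that solve the three matching equations of Example 2; the function names are hypothetical and the derivatives are written out by hand.

```python
import math

def yp(t):
    """Particular solution (1/2) t e^{-t} + t/3 - 4/9."""
    return 0.5 * t * math.exp(-t) + t / 3 - 4 / 9

def dyp(t):
    """First derivative of yp."""
    return 0.5 * math.exp(-t) - 0.5 * t * math.exp(-t) + 1 / 3

def d2yp(t):
    """Second derivative of yp."""
    return -math.exp(-t) + 0.5 * t * math.exp(-t)

def residual(t):
    """y'' + 4y' + 3y minus the forcing e^{-t} + t; should vanish."""
    return d2yp(t) + 4 * dyp(t) + 3 * yp(t) - (math.exp(-t) + t)

for t in (0.0, 0.5, 2.0, 5.0):
    assert abs(residual(t)) < 1e-12
```

Adding any multiple of the null solutions $e^{-t}$ and $e^{-3t}$ to `yp` would leave the residual at zero, which is why the complete solution is $y_p + y_n$.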


The method only applies to very special forcing functions, but when it succeeds it is as
fast and simple as possible. Let me list special inputs $f(t)$ and the form of a solution
$y(t)$ when the differential equation $Ay'' + By' + Cy = f(t)$ has constant coefficients.


1. $f(t)$ = polynomial in $t$                     $y(t)$ = polynomial in $t$ (same degree)
2. $f(t) = A\cos\omega t + B\sin\omega t$         $y(t) = M\cos\omega t + N\sin\omega t$
3. $f(t)$ = exponential $e^{st}$                  $y(t) = Ye^{st}$
4. $f(t)$ = product $t^2e^{st}$                   $y(t) = (at^2 + bt + c)\,e^{st}$


$t^2e^{st}$ is included in 4 by multiplying possibilities 1 and 3. The good form for $y(t)$
multiplies the solutions to 1 and 3. The coefficients $M, N, Y, a, b, c$ are "undetermined"
until you substitute $y(t)$ into the differential equation. That equation determines $a, b, c$.


Note to professors   It seems to me that a polynomial times $e^{t^2}$ shares the key property.
Its derivatives have the same form. But their polynomial degree goes up. Not good.
Example 3   Find a particular solution to $y'' + y = te^{st}$ = polynomial times $e^{st}$.
The good form to assume for $y(t)$ is $(at + b)\,e^{st}$. Please notice that $be^{st}$ is included.
Even though $f$ doesn't have $e^{st}$ by itself, that will appear in the derivatives of $te^{st}$.
To be sure we capture every derivative, $at + b$ must include that constant $b$.
I need to find the second derivative of the undetermined $y(t) = (at + b)\,e^{st}$.

$y' = s(at + b)\,e^{st} + ae^{st}$      $y'' = s^2(at + b)\,e^{st} + 2as\,e^{st}$.


Substitute $y$ and $y''$ into the equation $y'' + y = te^{st}$ and match terms to find $a$ and $b$:


Coefficient of $te^{st}$      $as^2 + a = 1$
Coefficient of $e^{st}$       $bs^2 + 2as + b = 0$

Those two equations produce   $a = \dfrac{1}{1 + s^2}$ and $b = \dfrac{-2as}{1 + s^2} = \dfrac{-2s}{(1 + s^2)^2}$.   (3)


Now $y(t) = (at + b)\,e^{st}$ is a particular solution of $y'' + y = te^{st}$.
Possible difficulty of the method   Suppose $s = i$ or $-i$ in the forcing term $f = te^{st}$.


Those exponents $s = i$ and $s = -i$ have $1 + s^2 = 0$. Our answer in (3) for $a$ and $b$ is dividing
by zero. The result is useless. What went wrong ?


Explanation   If $s = i$, the assumed form $y = (at + b)\,e^{it}$ includes a solution $be^{it}$
of $y'' + y = 0$. We have accidentally included a null solution $y_n = be^{it}$. There is no
hope of determining $b$. That coefficient is truly undetermined and it stays that way.

We are seeing a problem of resonance, when the hoped-for $y_p$ is already a part of $y_n$. The
result in Section 2.4 was that resonant solutions have and need an extra factor $t$. The same
is true here. When $s = i$ or $s = -i$, the good form to assume is $y_p = t(at + b)\,e^{st}$.


When you substitute this $y_p$ into $y'' + y = te^{st}$, the coefficients $a$ and $b$ will be
properly determined. If $s = i$, could you verify that $a = -i/4$ and $b = 1/4$?
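
For readers who want to see the resonant substitution carried out, here is one way to organize it for $s = i$. This is a sketch of the matching, using the assumed form $y_p = t(at + b)\,e^{it}$, and it leads to $a = -i/4$ and $b = 1/4$:

```latex
\begin{aligned}
y_p   &= (at^2 + bt)\,e^{it} \\
y_p'  &= (2at + b)\,e^{it} + i\,(at^2 + bt)\,e^{it} \\
y_p'' &= 2a\,e^{it} + 2i\,(2at + b)\,e^{it} - (at^2 + bt)\,e^{it} \\
y_p'' + y_p &= \bigl(4ia\,t + 2a + 2ib\bigr)\,e^{it} \;=\; t\,e^{it} \\
4ia = 1 \;&\Rightarrow\; a = -\tfrac{i}{4},
\qquad 2a + 2ib = 0 \;\Rightarrow\; b = ia = \tfrac{1}{4}.
\end{aligned}
```

The $t^2$ and $t$ terms coming from $-(at^2 + bt)e^{it}$ cancel against $y_p$ itself, which is exactly why the extra factor $t$ is needed at resonance.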


Example 4   Let me apply "undetermined coefficients" to an equation you already know :

$Ay'' + By' + Cy = \cos\omega t$.   (4)


Solution by undetermined coefficients   Look for $y(t) = M\cos\omega t + N\sin\omega t$. Those
coefficients $M$ and $N$ are also in equation (21) of Section 2.4.

$M = \dfrac{C - A\omega^2}{D}$      $N = \dfrac{B\omega}{D}$      $D = (C - A\omega^2)^2 + B^2\omega^2$

Is this perfect? Not quite. In case the denominator is $D = 0$, the method will fail. That
is exactly the case of resonance, when $A\omega^2 = C$ and $B = 0$. The coefficients $M$ and
$N$ become 0/0. The equation becomes $A(y'' + \omega^2y) = \cos\omega t$. The particular $y_p$ cannot
be $M\cos\omega t + N\sin\omega t$ because $\cos\omega t$ and $\sin\omega t$ are null solutions $y_n$.
They have $y'' + \omega^2y = 0$. The same $\omega$ is on both sides of the equation.


Resonant solutions   In case $D = 0$, the particular solution again has an extra factor $t$.


Then put $y_p = Mt\cos\omega t + Nt\sin\omega t$ into equation (4) to find $M = 0$ and $N = 1/(2A\omega)$.
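
The resonant coefficients follow from a short calculation. A sketch, using the resonant equation $A(y'' + \omega^2y) = \cos\omega t$ and the assumed form with the extra factor $t$:

```latex
\begin{aligned}
y_p &= Mt\cos\omega t + Nt\sin\omega t \\
(t\cos\omega t)'' &= -2\omega\sin\omega t - \omega^2\,t\cos\omega t \\
(t\sin\omega t)'' &= \phantom{-}2\omega\cos\omega t - \omega^2\,t\sin\omega t \\
y_p'' + \omega^2 y_p &= -2M\omega\sin\omega t + 2N\omega\cos\omega t \\
A\,(y_p'' + \omega^2 y_p) = \cos\omega t
 \;&\Rightarrow\; M = 0, \qquad N = \frac{1}{2A\omega}.
\end{aligned}
```

The terms with $t\cos\omega t$ and $t\sin\omega t$ cancel because $\cos\omega t$ and $\sin\omega t$ are null solutions. Only the terms without $t$ are left to match the forcing.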
Summary of the Method of Undetermined Coefficients


When the forcing term f(t) is a polynomial or a sinusoid or an exponential, look for a 
particular solution y(t) of the same form. Derivatives of polynomials are polynomials, 
derivatives of sinusoids are sinusoids, derivatives of exponentials are exponentials. Then 
all terms in Ay” + By’ + Cy = f will share the same form. 

When f(t) = sum of exponentials, look for y(t) = sum of exponentials. When f is a 
polynomial times a sinusoid or an exponential, y(t) has the same form. When a sinusoid or 
an exponential in f happens to be a null solution (resonance), include an extra t in yp. 


Question What form would you assume for $y(t)$ when $f(t) = 4e^t + 5\cos 2t + t$?

Answer Look for $y(t) = Ye^t + M\cos 2t + N\sin 2t + at + b$. The coefficients in
the differential equation need to be constants. Then $Ay''$, $By'$, $Cy$ and $f$ all look like $y$.


Variation of Parameters 


Now we want to allow any forcing function f(t). The equation might even have variable 
coefficients. If we know the null solutions, the method called “variation of parameters” 
can find a particular solution. 


Suppose the null solution with $f = 0$ is $y_n(t) = c_1y_1(t) + c_2y_2(t)$. We know $y_1$ and $y_2$.
For a particular solution when $f(t) \neq 0$, allow $c_1$ and $c_2$ to vary with time:

Variation of parameters
$$y_p(t) = c_1(t)y_1(t) + c_2(t)y_2(t) \tag{5}$$


This idea applies to any second order linear differential equation like 


$$y'' + B(t)y' + C(t)y = f(t). \tag{6}$$

Substituting $y_p(t)$ from (5) gives a first equation for $c_1'$ and $c_2'$. Those are the parameters
varying with $t$. To recognize a convenient second equation for $c_1'$ and $c_2'$, compute the
derivative of $y_p$ by the product rule:

$$y_p' = (c_1(t)y_1' + c_2(t)y_2') + (c_1'(t)y_1 + c_2'(t)y_2). \tag{7}$$

A good choice is to require that the second sum be zero:

Second equation for $c_1', c_2'$
$$c_1'(t)y_1(t) + c_2'(t)y_2(t) = 0. \tag{8}$$

Now the second sum in (7) drops out and we compute $y_p''$ (product rule again):

$$y_p'' = (c_1(t)y_1'' + c_2(t)y_2'') + (c_1'(t)y_1' + c_2'(t)y_2'). \tag{9}$$

Put $y_p, y_p', y_p''$ from (5), (7), (9) into the differential equation to get a wonderful
result:

First equation for $c_1', c_2'$
$$c_1'(t)y_1'(t) + c_2'(t)y_2'(t) = f(t). \tag{10}$$

That became simple because the null solutions $y_1$ and $y_2$ satisfy $y'' + B(t)y' + C(t)y = 0$.

We now have two equations (8) and (10) for two unknowns $c_1'(t)$ and $c_2'(t)$. At each
time $t$, the four coefficients $P, Q, R, S$ in the two equations are the numbers $y_1(t)$, $y_2(t)$,
$y_1'(t)$, $y_2'(t)$. Solve those two equations, first using $P, Q, R, S$:

$$\begin{aligned} Pc_1' + Qc_2' &= 0 \\ Rc_1' + Sc_2' &= f \end{aligned} \quad \text{lead to} \quad c_1' = \frac{-Qf}{PS - QR} \quad \text{and} \quad c_2' = \frac{Pf}{PS - QR} \tag{11}$$

When you multiply those fractions by $P$ and $Q$ and add, they cancel. When you multiply the fractions
by $R$ and $S$ and add, the result is the second equation $Rc_1' + Sc_2' = f(t)$.

Linear equations come at the beginning of linear algebra in Chapter 4. Here we have a
separate problem for each time $t$, and the solution (11) becomes (12) when $P, Q, R, S$ are
$y_1(t)$, $y_2(t)$, $y_1'(t)$, $y_2'(t)$. I will write $W$ for $PS - QR$:

$$c_1'(t) = \frac{-y_2(t)f(t)}{W(t)} \qquad c_2'(t) = \frac{y_1(t)f(t)}{W(t)} \qquad W(t) = y_1y_2' - y_2y_1' \tag{12}$$

This denominator $W(t)$ is the Wronskian of the two null solutions $y_1(t)$ and $y_2(t)$.
It was introduced in Section 2.1, and the independence of $y_1(t)$ and $y_2(t)$ guarantees that
$W(t) \neq 0$. The divisions by $W(t)$ in (12) are safe. The varying parameters $c_1(t)$ and
$c_2(t)$ are the integrals of $c_1'(t)$ and $c_2'(t)$ in (12).

We have found a particular solution $c_1y_1 + c_2y_2$ to the differential equation (6):

If $y_1$ and $y_2$ are independent null solutions to $y'' + B(t)y' + C(t)y = 0$, then a
particular solution $y_p(t)$ with right side $f(t)$ is $c_1(t)y_1(t) + c_2(t)y_2(t)$:

Variation of Parameters
$$y_p(t) = -y_1(t)\int \frac{y_2(t)f(t)}{W(t)}\,dt + y_2(t)\int \frac{y_1(t)f(t)}{W(t)}\,dt \tag{13}$$


Example 5 Variation of parameters: Find a particular solution for $y'' + y = t$.

The right side $f(t) = t$ is not a sinusoid. No problem to find the independent solutions
$y_1(t) = \cos t$ and $y_2(t) = \sin t$ to the null equation $y'' + y = 0$. The Wronskian is 1:

$$W(t) = y_1y_2' - y_2y_1' = \cos^2 t + \sin^2 t = 1 \quad \text{(never zero as predicted)}.$$

The particular solution $y_p(t) = c_1(t)\cos t + c_2(t)\sin t$ needs integrals of $c_1'$ and $c_2'$:

$$c_1(t) = \int \frac{(-\sin t)\,t\,dt}{W(t)} = t\cos t - \sin t \qquad c_2(t) = \int \frac{(\cos t)\,t\,dt}{W(t)} = t\sin t + \cos t.$$

Variation of parameters has found a particular solution $c_1y_1 + c_2y_2$, and it simplifies:

$$y_p(t) = (t\cos t - \sin t)\cos t + (t\sin t + \cos t)\sin t = t. \tag{14}$$


Apologies! We could have seen by ourselves that $y = t$ solves $y'' + y = t$. And the method
of undetermined coefficients would find $y = t$ much faster: no integrations.
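The variation of parameters formula can also be evaluated by numerical integration. The following is a minimal sketch in Python, assuming definite integrals from 0 to $t$ (a choice of integration constants that the text leaves open) and Simpson's rule; the names `simpson` and `variation_of_parameters` are hypothetical. With that lower limit the result for $y'' + y = t$ is $t - \sin t$, which differs from $y = t$ only by a null solution.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def variation_of_parameters(y1, y2, W, f, t):
    """y_p(t) = -y1(t) * int_0^t (y2 f / W) + y2(t) * int_0^t (y1 f / W)."""
    c1 = -simpson(lambda T: y2(T) * f(T) / W(T), 0.0, t)
    c2 = simpson(lambda T: y1(T) * f(T) / W(T), 0.0, t)
    return c1 * y1(t) + c2 * y2(t)

def yp(t):
    # y'' + y = t with y1 = cos t, y2 = sin t, W = 1
    return variation_of_parameters(math.cos, math.sin, lambda T: 1.0, lambda T: T, t)

print(yp(2.0), 2.0 - math.sin(2.0))   # the two numbers should agree closely
```

Starting both integrals at 0 makes $c_1(0) = c_2(0) = 0$, so this particular solution starts from $y_p(0) = 0$ and $y_p'(0) = 0$.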
Example 6 Solve $y'' + y = \delta(t)$ by variation of parameters. The null solutions $\cos t$
and $\sin t$ still give $W(t) = 1$. The delta function $f$ goes into the integrals for $c_1$ and $c_2$:

$$c_1 = \int -\sin t\,\delta(t)\,dt = -\sin 0 = 0 \qquad c_2 = \int \cos t\,\delta(t)\,dt = \cos 0 = 1$$

Then $y_p(t) = (1)y_2(t) = \sin t$. With $f = \delta(t)$, this is the fundamental solution $g(t)$
(the impulse response). Then $\sin t$ is also the solution to $y'' + y = 0$ that starts from $y(0) = 0$
and $y'(0) = 1$. We will find this growth factor again in (17) with $s_1 = -s_2 = i$.


Constant Coefficients and the Solution Formula 


The one time we are sure to know the null solutions $y_1$ and $y_2$ is when the differential
equation has constant coefficients. Substituting $y = e^{st}$ into $Ay'' + By' + Cy = 0$ leads
to $As^2 + Bs + C = 0$. The roots are $s_1$ and $s_2$. The null solutions are $e^{s_1t}$ and $e^{s_2t}$.
Notice that we are free to assume that $A = 1$. (If not, divide the equation by $A$.)

Variation of parameters gives the solution (13). All we need is the Wronskian $W(t)$,
and for these null solutions it is beautiful:

$$W(t) = y_1y_2' - y_2y_1' = (e^{s_1t})(s_2e^{s_2t}) - (e^{s_2t})(s_1e^{s_1t}) = (s_2 - s_1)e^{s_1t}e^{s_2t}. \tag{15}$$

Immediately we know that $W(t) \neq 0$ unless $s_1 = s_2$. With equal roots we expect to need
the special null solution $y_2 = te^{st}$. Even in that case the Wronskian looks terrific:

$$W(t) = (e^{st})(te^{st})' - (te^{st})(e^{st})' = (e^{st})(ste^{st} + e^{st}) - (te^{st})(se^{st}) = e^{2st}. \tag{16}$$


When you substitute y; and y2 and W into (13), that “VP formula” produces y(t). 


Unequal roots $s_1 \neq s_2$. The first integral has $y_2/W = e^{-s_1T}/(s_2 - s_1)$. The second
integral has $y_1/W = e^{-s_2T}/(s_2 - s_1)$. Put those into (13):

Particular solution, constant coefficients
$$y_p(t) = \frac{-e^{s_1t}}{s_2 - s_1}\int_0^t e^{-s_1T}f(T)\,dT + \frac{e^{s_2t}}{s_2 - s_1}\int_0^t e^{-s_2T}f(T)\,dT$$

To me, a growth factor $g(t - T)$ is multiplying the inputs $f(T)$. The integrals just sum up
the outputs. Here is the same formula for $y_p(t)$ written so it uses $g(t)$:

$$\text{Growth factor } g(t) = \frac{e^{s_1t} - e^{s_2t}}{s_1 - s_2} \qquad \text{Solution } y_p(t) = \int_0^t g(t - T)f(T)\,dT \tag{17}$$


That might be the nicest formula in the book. Probably I am writing those words because
I didn't see this formula coming. Section 2.3 discovered the same response $g(t)$!
Forgive me for that personal note. I will go on to the other case, with $s_1 = s_2$.

Equal roots $s_1 = s_2 = s$ with $W = e^{2st}$. The first integral in (13) still has $y_1 = e^{st}$ and
now $y_2/W = Te^{-sT}$. The second integral has $y_2 = te^{st}$ and $y_1/W = e^{-sT}$:

Particular solution, null solutions $e^{st}$ and $te^{st}$
$$y_p(t) = -e^{st}\int_0^t Te^{-sT}f(T)\,dT + te^{st}\int_0^t e^{-sT}f(T)\,dT.$$

This also has a perfect form when you identify the factor $g(t - T)$ that is multiplying $f$:

$$\text{Growth factor } g(t) = te^{st} \qquad \text{Solution } y_p(t) = \int_0^t g(t - T)f(T)\,dT \tag{18}$$


Formulas that good never happen by accident. $g(t)$ must mean something important:
The growth factor $g(t)$ is the impulse response: $y_p(t)$ is $g(t)$ when $f(t)$ is $\delta(t)$.
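The growth factor formula is easy to try on a computer. Here is a hedged sketch in Python for $y'' + By' + Cy = f(t)$ with real constants $B$ and $C$ (taking $A = 1$); the names `growth_factor` and `response` and the Simpson step count are my own assumptions. The exact comparison `s1 == s2` is good enough for simple inputs like the ones shown, not for nearly equal roots.

```python
import cmath
import math

def growth_factor(B, C):
    """Return g(t), the impulse response of y'' + B y' + C y = delta(t)."""
    disc = cmath.sqrt(B * B - 4 * C)
    s1, s2 = (-B + disc) / 2, (-B - disc) / 2
    if s1 == s2:                                   # equal roots: g = t e^{st}
        return lambda t: (t * cmath.exp(s1 * t)).real
    # unequal roots: g = (e^{s1 t} - e^{s2 t}) / (s1 - s2), real when B and C are real
    return lambda t: ((cmath.exp(s1 * t) - cmath.exp(s2 * t)) / (s1 - s2)).real

def response(B, C, f, t, n=2000):
    """y_p(t) = int_0^t g(t - T) f(T) dT by composite Simpson's rule (n even)."""
    g = growth_factor(B, C)
    h = t / n
    total = g(t) * f(0.0) + g(0.0) * f(t)
    for k in range(1, n):
        T = k * h
        total += (4 if k % 2 else 2) * g(t - T) * f(T)
    return total * h / 3

t = 1.5
# Step input f = 1 for y'' + 3y' + 2y: compare with 1/2 - e^{-t} + e^{-2t}/2
print(response(3, 2, lambda T: 1.0, t), 0.5 - math.exp(-t) + 0.5 * math.exp(-2 * t))
# Equal roots, y'' + 2y' + y = 1: compare with 1 - (1 + t) e^{-t}
print(response(2, 1, lambda T: 1.0, t), 1 - (1 + t) * math.exp(-t))
```

Each pair of printed numbers should agree to many digits, since the integrands are smooth.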


Let me close Section 2.6 on that high note. Then Section 2.7 will take the Laplace
transform of the growth factors $g(t)$ to get the transfer function $Y(s)$:

$$\text{The transform of } g(t) = \frac{e^{s_1t} - e^{s_2t}}{s_1 - s_2} \text{ is } \frac{1}{(s - s_1)(s - s_2)} = \frac{1}{s^2 + Bs + C} = Y(s).$$

$$\text{The transform of } g(t) = te^{s_1t} \text{ is } \frac{1}{(s - s_1)^2} = \frac{1}{s^2 + Bs + C} \text{ when } s_1 = s_2.$$

$Y(s)$ comes from $B$ and $C$. The solution $y(t)$ comes from $g(t)$ = "Green's function."
The last pages of the book will see the integral of $g(t - T)f(T)$ as a convolution.


REVIEW OF THE KEY IDEAS

1. Undetermined coefficients in $y_p$ apply when $f(t)$ has only $e^{st}$, $\cos\omega t$, $\sin\omega t$, $t^n$.
2. Set $y_p$ = exponential/sinusoid/polynomial. Find coefficients $a, b, \ldots$ to match $f(t)$.
3. Variation of parameters: $c_1$ and $c_2$ vary with $t$ in $y_p = c_1(t)y_1(t) + c_2(t)y_2(t)$.
4. Two equations for $c_1'$ and $c_2'$ lead to $c_1$ and $c_2$ = integrals of $-y_2f/W$ and $y_1f/W$.
5. For constant coefficients, $c_1$ and $c_2$ involve integrals of $e^{-s_1t}f(t)$ and $e^{-s_2t}f(t)$.
6. Then $y_p = \int_0^t g(t - s)f(s)\,ds$ when $g(t)$ = response to the impulse $f = \delta(t)$.

Problem Set 2.6


Find a particular solution by inspection (or the method of undetermined coefficients) 


1 (a) $y'' + y = 4$ (b) $y'' + y' = 4$ (c) $y'' = 4$

2 (ay y"+yl/ty=e (b) yl +y/+y=e% 

3 (a) $y'' - y = \cos t$ (b) $y'' + y = \cos 2t$ (c) $y'' + y = t + e^t$
4 For these f(t), predict the form of y(t) with undetermined coefficients : 


isi (iia t2 (b) f(t) = cos 2t (c) f(t)=tcost 
5 Predict the form for y(t) when the right hand side is 

(@af(é)i=e* "(by fh) Ste™ (c) f(t)=e' cost 
6 For f(t) = e when is the prediction for y(t) different from Y e* ? 


Use the method of undetermined coefficients to find a solution y,(t). 


7 (a) y" + 9y =e (b) y+ 9y = te 
8 (a) y"+y'=t4+1 (b) yf! gl =P 44 
9 (a) y" pie 3yi= cos t (b) yl" AE 3y Soa 


10 (a yy" t+ y/t+yH=P 0b) y"ty’+y=0 


11 (a) y”+y'+y=cost (b) y”+y'+y=tsint 


Problems 12-14 involve resonance. Multiply the usual form of $y_p$ by $t$.

12 (a) y’ +y=e% (b) y”+y=cost 

13 (a) y” —4y'+3y=e% (b) yy” —4y'+3y = e* 

14 (a) y’ -y=e (bl) y’—y=te’ (c) y'—y=e' cost 


15 For $y'' + 4y = e^t\sin t$ (exponential times sinusoidal) we have two choices:

1 (Real) Substitute $y_p = Me^t\cos t + Ne^t\sin t$: determine $M$ and $N$
2 (Complex) Solve $z'' + 4z = e^{(1+i)t}$. Then $y$ is the imaginary part of $z$.

Use both methods to find the same $y(t)$. Which do you prefer?
16 (a) Which values of $c$ give resonance for $y'' + 3y' - 4y = te^{ct}$?

(b) What form would you substitute for $y(t)$ if there is no resonance?

(c) What form would you use when $c$ produces resonance?

17 This is the rule for equations $P(D)y = e^{ct}$ with resonance $P(c) = 0$:

If $P(c) = 0$ and $P'(c) \neq 0$, look for a solution $y_p = Cte^{ct}$ ($m = 1$).
If $c$ is a root of multiplicity $m$, then $y_p$ has the form ____.


18 (a) To solve $d^4y/dt^4 - y = t^3e^{5t}$, what form do you expect for $y(t)$?

(b) If the right side becomes $t^3\cos 5t$, which 8 coefficients are to be determined?


19 For $y' - ay = f(t)$, the method of undetermined coefficients is looking for all $f(t)$ so
that the usual formula $y_p = e^{at}\int e^{-as}f(s)\,ds$ is easy to integrate. Find these integrals
for the "nice functions" $f = e^{ct}$, $f = e^{i\omega t}$, and $f = t$:

$$\int e^{-as}e^{cs}\,ds \qquad \int e^{-as}e^{i\omega s}\,ds \qquad \int e^{-as}s\,ds$$


Problems 20-27 develop the method of variation of parameters. 


20 ‘Find two solutions y;, y2 to y” + 3y’ + 2y = 0. Use those in formula (13) to solve 
(aly! +:3y' + 2ySe 3) yy" +3y' +29 =e°% 

21 ~~‘ Find two solutions to y” + 4y’ = 0 and use variation of parameters for 
(a) y" + 4y! =e” (b) oy" +4y’=e% 


22 ‘Find an equation y” + By’ + Cy = 0 that is solved by y; = e* and yo = te’. 
If the right side is f(t) = 1, what solution comes from the V P formula (13)? 


23 $y'' - 5y' + 6y = 0$ is solved by $y_1 = e^{2t}$ and $y_2 = e^{3t}$, because $s = 2$ and
$s = 3$ come from $s^2 - 5s + 6 = 0$. Now solve $y'' - 5y' + 6y = 12$ in two ways:

1. Undetermined coefficients (or inspection) 2. Variation of parameters using (13)
The answers are different. Are the initial conditions different?


24 What are the initial conditions $y(0)$ and $y'(0)$ for the solution (13) coming from
variation of parameters, starting from any $y_1$ and $y_2$?

25 The equation $y'' = 0$ is solved by $y_1 = 1$ and $y_2 = t$. Use variation of parameters to
solve $y'' = t$ and also $y'' = t^2$.

26 Solve $y_s'' + y_s = 1$ for the step response using variation of parameters, starting from
the null solutions $y_1 = \cos t$ and $y_2 = \sin t$.

27 Solve $y_s'' + 3y_s' + 2y_s = 1$ for the step response starting from the null solutions
$y_1 = e^{-t}$ and $y_2 = e^{-2t}$.


28 Solve $Ay'' + Cy = \cos\omega t$ when $A\omega^2 = C$ (the case of resonance). Example 4
suggests to substitute $y = Mt\cos\omega t + Nt\sin\omega t$. Find $M$ and $N$.

29 Put $g(t)$ into the great formulas (17)-(18) to see the equations above them.

2.7 Laplace Transforms Y(s) and F(s)


If you think about the functions that have dominated this book, the list is not very long. 
They are the right hand sides of linear differential equations and also the solutions y(t) : 


1. Exponentials $e^{at}$
2. Sinusoids $\cos\omega t$ and $\sin\omega t$
3. Polynomials starting with 1 and $t$ and $t^2$
4. Step functions $H(t - T)$
5. Delta functions $\delta(t - T)$
6. Products of 1 to 5


Why are these functions special ? I believe this is an important question. 
The answer that strikes me first is something I had not thought about: 


The derivatives and integrals of these functions are also on the list (almost). 


That was true from the very start of Chapter 1. Example 1 on page 1 was $y = e^t$. Its
fundamental property is $dy/dt = y$. The derivative leaves it unchanged, which puts it on the
list. And the product of two exponentials is another exponential. In fact exponentials could
be a short list by themselves.

Cosines and sines were written separately, but those are combinations of $e^{i\omega t}$ and $e^{-i\omega t}$.
They just move us to complex numbers. The constant polynomial is $e^{0t} = 1$. Integrals and
derivatives of polynomials are polynomials. The product rule for derivatives (and the reverse
rule which is integration by parts) keep the list self-contained: no new functions.

There is one flaw but it is easily fixed. The delta function $\delta(t)$ is the derivative of the
step function $H(t)$, but we need all derivatives and integrals. Include them on the list!
Solving $dy/dt$ = step function gives $y(t)$ = ramp function. This is zero for $t < 0$, and
$y(t) = t$ for $t \geq 0$. Its graph has a corner and its slope has a jump. The integral of that
linear ramp is a parabolic ramp. The next integral leads toward a cubic spline. The
derivative of a delta function is a very singular object (see Problem 25).

In the end, all these ideal functions can go on the list which is now complete. 


The Algebra of Differential Equations 


With those special functions, solving a constant coefficient linear differential equation is
not so difficult. It reduces to an algebra problem. The null solution $y_n$ is a combination
of exponentials (possibly times powers of $t$). The particular solution $y_p$ has a known
form like $Ye^{st}$: the differential equation will decide the undetermined coefficient $Y$.
For functions 1 to 6, the integrals using variation of parameters are already on the list.

The Laplace transform gives a systematic way to do the algebra. Functions of $t$
become functions of $s$. Instead of derivatives $dy/dt$, we have multiplications $sY(s)$. Then
differential equations in $t$ become algebra equations in $s$. Start with these examples:

Left side $y(t) \to Y(s)$, $y'(t) \to sY(s)$ and $y''(t) \to s^2Y(s)$ when $y(0) = y'(0) = 0$

Right side $f(t) \to F(s)$


Solving a differential equation by using the Laplace transform involves three steps : 
1 Transform every term 2 Solve for Y(s) 3 Find y(t) whose transform is Y (s). 


You will see how initial values for $y(0)$ and $y'(0)$ go into the s-equation for $Y(s)$. And
most important, you will see how the zeros of the polynomial $s^2 + Bs + C$ become
"poles" of $Y(s)$. Those exponents $s_1$ and $s_2$ give us the null solution $y_n(t)$. Dividing
by that polynomial gives the transfer function $1/(s^2 + Bs + C)$. Now we see all of this as a
natural part of the Laplace transform.


Example 1 Start from $y(0) = 0$ and $y'(0) = 0$. With those initial conditions, the
transform of $y'$ is $sY$ and the transform of $y''$ is $s^2Y$. We can transform a whole equation:

Step 1 $y'' - 4y' + 3y = e^{at}$ transforms to $(s^2 - 4s + 3)Y(s) = \dfrac{1}{s - a}$

Step 2 The transform of $y(t)$ is $Y(s) = \dfrac{1}{(s^2 - 4s + 3)(s - a)} = \dfrac{1}{(s - 3)(s - 1)(s - a)}$

Step 3 The inverse Laplace transform of $Y(s)$ is $y(t) = C_1e^{3t} + C_2e^t + Ge^{at}$.

$C_1$ and $C_2$ come from matching the initial conditions $y(0) = 0$ and $y'(0) = 0$. The gain
$G = 1/(a^2 - 4a + 3)$ is the transfer function at $s = a$. The inverse transform of $Y(s)$ is
computed in equations (12) and (14). Step 2 revealed the poles of $Y(s)$:

$$\frac{1}{(s - 3)(s - 1)(s - a)} \text{ has poles at } s = 3 \text{ and } s = 1 \text{ and } s = a.$$

Those three numbers are the all-important exponents in $y(t) = C_1e^{3t} + C_2e^t + Ge^{at}$.
Now they are seen as the poles 3, 1, $a$ where $Y(s)$ becomes infinite.


Example 2 Change from $f = e^{at}$ to $f = \delta(t)$ = impulse. Keep $y(0) = y'(0) = 0$.

Step 1 $y'' + By' + Cy = \delta(t)$ transforms to $(s^2 + Bs + C)Y(s) = 1$.

Step 2 The transform of $y(t)$ is $Y(s) = \dfrac{1}{s^2 + Bs + C}$ = transfer function.

Step 3 The inverse transform is $y(t) = g(t) = \dfrac{e^{s_1t} - e^{s_2t}}{s_1 - s_2}$ = impulse response.

Those roots $s_1, s_2$ of $s^2 + Bs + C = (s - s_1)(s - s_2)$ give poles in $Y(s)$ and exponentials
in $y(t)$. You have to be impressed by how quickly steps 1-2-3 led to this central fact.

When $f = \delta(t)$, the transform of the impulse response $g$ is the transfer function $Y$.


The Laplace Transform 


Our first Table of Transforms will include the most essential functions and no more. A 
more complete presentation of this transform will be saved for the last sections of the book. 
We will define Y(s) here, but the shift rule for transforms will be developed there. All 
step functions H(t — T) are left for Chapter 8, except for one comment below. 

Especially we point to the final Section 8.6 on "convolutions". These are the inverse
transforms of products $Y(s) = F(s)G(s)$. Convolution is exactly what we need when
$f(t)$ is not a simple function like $e^{at}$ and $F(s)$ is not a simple function like $1/(s - a)$.

To create the Table of Transforms we start with the integral that defines $F(s)$:

$$\text{The Laplace transform of } f(t) \text{ is } F(s) = \int_0^\infty f(t)e^{-st}\,dt. \tag{1}$$

The first function to transform is certainly $f(t) = e^{at}$. Then $F(s) = 1/(s - a)$ as expected:

$$F(s) = \int_0^\infty e^{at}e^{-st}\,dt = \left[\frac{e^{(a-s)t}}{a - s}\right]_{t=0}^{t=\infty} = \frac{1}{s - a}. \tag{2}$$

That integral would be infinite if $a > s$. It is typical of Laplace transforms to require $s > a$.
Then the factor $e^{-st}$ in the integral brings us safely to zero at $t = \infty$. The following rule
is natural for all functions $f(t)$, when you look at the integral (1) from $t = 0$ to $t = \infty$:

By definition $f(t) = 0$ for all $t < 0$. Functions don't start until $t = 0$.


Then the step function $H(t)$ and the constant function $f = 1$ have the same transform!

$$\text{The transform of } f(t) = 1 \text{ is } F(s) = \int_0^\infty e^{-st}\,dt = \frac{1}{s}. \tag{3}$$

This is the transform of $e^{at}$ when the exponent $a$ goes to 0 and $1/(s - a)$ goes to $1/s$.
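The defining integral can be approximated directly, which gives a quick sanity check on these first two transforms. This is a rough sketch in Python, not a recommended way to compute transforms: the function name `laplace`, the cutoff `T`, and the step count `n` are assumptions, and the tail of the integral beyond `T` is simply dropped.

```python
import math

def laplace(f, s, T=40.0, n=40000):
    """Approximate F(s) = int_0^inf f(t) e^{-st} dt by Simpson's rule on [0, T].

    T and n are illustrative choices. s must be large enough that
    f(t) e^{-st} is negligible beyond t = T.
    """
    h = T / n
    total = f(0.0) + f(T) * math.exp(-s * T)
    for k in range(1, n):
        t = k * h
        total += (4 if k % 2 else 2) * f(t) * math.exp(-s * t)
    return total * h / 3

a, s = 1.0, 3.0
print(laplace(lambda t: math.exp(a * t), s), 1 / (s - a))   # transform of e^{at}
print(laplace(lambda t: 1.0, s), 1 / s)                     # transform of 1
```

With $s > a$ the integrand decays like $e^{(a-s)t}$, so cutting the integral off at a finite `T` costs almost nothing here.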
Transform of the Derivative 


Now comes the most important rule—the whole basis for solving differential equations. 
If the transform of y(t) is Y(s), what is the transform of the derivative dy/dt ? 


Derivative Rule The transform of $dy/dt$ is $sY(s) - y(0)$.

The derivative rule shows how the initial conditions enter the transformed problem:
not as separate side conditions, but directly into the equation for $Y(s)$. The proof uses
integration by parts. The integral of $dy/dt$ is $y(t)$ and the derivative of $e^{-st}$ is $-se^{-st}$:

$$\int_0^\infty \frac{dy}{dt}e^{-st}\,dt = -\int_0^\infty y(t)(-se^{-st})\,dt + \Big[y(t)e^{-st}\Big]_0^\infty$$

$$\text{Transform of } dy/dt = sY(s) - y(0) \tag{4}$$

Again $s$ must be large enough (or more exactly, the real part of $s$ must be large enough)
to assure that $y(t)e^{-st}$ drops to zero at $t = \infty$.
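A one-line check of the derivative rule uses the function we already transformed. This worked example is my own addition, taking $y = e^{at}$ so that $Y(s) = 1/(s - a)$ and $y(0) = 1$:

```latex
\[
y = e^{at}: \qquad
\text{transform of } \frac{dy}{dt} = \text{transform of } ae^{at} = \frac{a}{s-a},
\qquad
sY(s) - y(0) = \frac{s}{s-a} - 1 = \frac{a}{s-a}.
\]
```

The two sides agree, as the rule predicts.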

We can immediately solve the model problem of Chapter 1: A first order linear
equation. The solution steps 1, 2, 3 produce $Y(s)$ with poles (blowup values for $s$) at
the two key exponents $s = a$ and $s = c$:

Example 3 Solve $\dfrac{dy}{dt} - ay = e^{ct}$ starting from any $y(0)$.

Step 1 Transform the equation to $sY(s) - y(0) - aY(s) = \dfrac{1}{s - c}$ (5)

Step 2 $(s - a)Y(s) = y(0) + \dfrac{1}{s - c}$ gives $Y(s) = \dfrac{y(0)}{s - a} + \dfrac{1}{(s - a)(s - c)}$ (6)

Step 3 The inverse transform of $\dfrac{y(0)}{s - a}$ is the null solution $y_n(t) = y(0)e^{at}$. (7)

The inverse transform of $\dfrac{1}{(s - a)(s - c)}$ is the very particular solution $\dfrac{e^{ct} - e^{at}}{c - a}$ (8)
(s —a)(s —c) c—a 


I have to say, this is beautiful. The effort we made in Chapter 1 has been reduced to its
bare minimum. All that is left is the derivative rule, the transform of exponentials, and "partial
fractions." Those partial fractions were the algebra from Step 2 to Step 3:
separating $1/(s - a)(s - c)$ with two poles $a$ and $c$ into two fractions with one pole each.

$$\text{PF2} \qquad \frac{1}{(s - a)(s - c)} = \frac{1}{(s - a)(a - c)} + \frac{1}{(c - a)(s - c)} \tag{9}$$
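To see the algebra from Step 2 to Step 3 in full, here is a short worked version. It is my own spelling-out of the step, and it assumes $c \neq a$. Each fraction with one pole inverts to one exponential, and the result can be checked in the original equation:

```latex
\[
\frac{1}{(s-a)(s-c)} = \frac{1}{c-a}\left(\frac{1}{s-c} - \frac{1}{s-a}\right)
\quad\longrightarrow\quad
y(t) = \frac{e^{ct} - e^{at}}{c-a}, \qquad y(0) = 0 .
\]
\[
\text{Check:}\quad
\frac{dy}{dt} - ay = \frac{ce^{ct} - ae^{at} - ae^{ct} + ae^{at}}{c-a} = e^{ct}.
\]
```

The solution starts from zero, which is what makes it the very particular solution.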


PF2 was used in Example 2 to find the impulse response. In that case $a$ and $c$ were $s_1$
and $s_2$. Partial fractions were also used in Example 1, with $f = e^{at}$ and three poles 3, 1, $a$.

Partial Fractions

Example 1 reached $Y(s) = 1/(s - 3)(s - 1)(s - a)$. We didn't immediately know
its inverse transform $y(t)$. But finding $y(t)$ becomes simple when $Y(s)$ is separated into
three terms with one pole each. Those three pieces are the Partial Fractions in PF3:

$$\text{PF3} \qquad \frac{1}{(s-3)(s-1)(s-a)} = \frac{1}{(s-3)(3-1)(3-a)} + \frac{1}{(1-3)(s-1)(1-a)} + \frac{1}{(a-3)(a-1)(s-a)}$$
Usually I would show you where this PF3 formula comes from. In this case I would rather
show you that it is correct. Above all, you must see the main point: The three separate terms
with one pole each lead immediately to the three parts $C_1e^{3t}$ and $C_2e^t$ and $Ye^{at}$.
Officially, correctness can be proved by multiplying PF3 by $(s - 3)(s - 1)(s - a)$:

$$1 = \frac{(s-1)(s-a)}{(3-1)(3-a)} + \frac{(s-3)(s-a)}{(1-3)(1-a)} + \frac{(s-3)(s-1)}{(a-3)(a-1)} \tag{10}$$


At s = 3, the last two terms disappear and we have 1 = 1 (as desired). At s = 1, 
the second term equals 1. At s = a, the third term equals 1. Thus (10) is an equation 
of the form 1 = As? + Bs + C, and the equation is correct at three values s = 3,1,a. 
Therefore the equation must be always correct, and PF3 is shown to be true. 


Remark The theory of partial fractions usually computes $C_1$ and $C_2$ and $Y$ so that

$$\frac{1}{(s-3)(s-1)(s-a)} = \frac{C_1}{s-3} + \frac{C_2}{s-1} + \frac{Y}{s-a}. \tag{11}$$

The idea is to put the right side over a common denominator, which is on the left side.
Matching the coefficients of $s^2$ and $s$ and 1 gives three equations for $C_1$ and $C_2$ and $Y$.
My shortcut was to go directly to the answers $C_1, C_2, Y$ that you see in PF3 (this $Y$ is the gain $G$ of Example 1):

$$C_1 = \frac{1}{(3-1)(3-a)} \qquad C_2 = \frac{1}{(1-3)(1-a)} \qquad Y = \frac{1}{(a-3)(a-1)}. \tag{12}$$

I think it is easier to remember this pattern than to solve for a new $C_1$ and $C_2$ and $Y$
every time you change the poles 3 and 1 and $a$. To repeat, from the three partial
fractions in PF3 we read off the coefficients $C_1, C_2, Y$ in equation (12).
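The pattern works for any list of distinct poles, and exact rational arithmetic makes it easy to confirm. This is a small sketch in Python; the function name `pf_coefficients` and the choice $a = 5$ are mine, purely for illustration.

```python
from fractions import Fraction

def pf_coefficients(poles):
    """For distinct poles p_k, return C_k with 1/prod(s - p_k) = sum C_k / (s - p_k).

    Each C_k is 1 / prod_{j != k} (p_k - p_j), the pattern seen in PF2 and PF3.
    """
    coeffs = []
    for k, pk in enumerate(poles):
        denom = Fraction(1)
        for j, pj in enumerate(poles):
            if j != k:
                denom *= Fraction(pk) - Fraction(pj)
        coeffs.append(1 / denom)
    return coeffs

poles = [3, 1, 5]                      # poles 3, 1 and a = 5 (arbitrary illustrative a)
C1, C2, Y = pf_coefficients(poles)
print(C1, C2, Y)

# Compare both sides of PF3 at a test point s that is not a pole
s = Fraction(7, 2)
lhs = 1 / ((s - 3) * (s - 1) * (s - 5))
rhs = sum(c / (s - p) for c, p in zip((C1, C2, Y), poles))
print(lhs == rhs)
```

The same function with two poles reproduces PF2.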


Very Particular Solution 


Look at what we have in those three parts. The last part $Ye^{at}$ is a particular solution,
the one that comes from the transfer function and the exponential response formula.
The equation was $y'' - 4y' + 3y = e^{at}$, and the response to $e^{at}$ is

$$y_p(t) = \frac{1}{a^2 - 4a + 3}e^{at} = \frac{1}{(a-3)(a-1)}e^{at}. \tag{13}$$

That is old news. This is not the very particular solution: it doesn't start at $y(0) = 0$ and
$y'(0) = 0$. The solution with that particular start is the one from the Laplace transform:

$$\text{The very particular solution is all of } y_{vp}(t) = C_1e^{3t} + C_2e^t + Ye^{at}. \tag{14}$$

Remember, any null solution $y_n$ can be added to one particular $y_p$. That gives another $y_p$.
The very particular solution $y_{vp}$ starts from rest.

The complete solution adjusts the free constants $c_1$ and $c_2$ (note the small c) to match
any starting values $y(0)$ and $y'(0)$:

$$y_{\text{complete}} = c_1e^{3t} + c_2e^t + Ye^{at}. \tag{15}$$

You could solve for $c_1$ and $c_2$ as usual, by setting $t = 0$ in $y$ and $y'$. Then you are working
in the time domain. Or you could use $y(0)$ and $y'(0)$ in finding $Y(s)$, when you transform
the equation in the first place. Let me show you that way, compared to the usual way.


Including y(0) and y’(0) in the Transform 


We know that the transform of $y'$ is $sY(s) - y(0)$. To find the transform of $y''$, use that
first derivative rule twice. This brings in $y'(0)$ along with $y(0)$.

$$\begin{aligned} \text{transform of } y'' &= s(\text{transform of } y') - y'(0) \\ &= s(sY(s) - y(0)) - y'(0) \\ &= s^2Y(s) - sy(0) - y'(0). \end{aligned} \tag{16}$$

Now we can solve the equation $y'' - 4y' + 3y = e^{at}$ entirely by Laplace transform:

Step 1 Transform to $(s^2Y(s) - sy(0) - y'(0)) - 4(sY(s) - y(0)) + 3Y(s) = \dfrac{1}{s - a}$

Step 2 Rewrite as $(s^2 - 4s + 3)Y(s) = (s - 4)y(0) + y'(0) + 1/(s - a)$.

Solve for $Y(s)$: $Y(s) = \dfrac{(s - 4)y(0) + y'(0)}{(s - 3)(s - 1)} + \dfrac{1}{(s - 3)(s - 1)(s - a)}$ (17)

Step 3 Invert both pieces of $Y(s)$ to find $y_n(t) + y_p(t)$.


This looks more painful to me! The last part of Y(s) is fine—that is what we already 
worked with to find yp. Its inverse transform is the very particular solution in (14). The 
first part of Y(s) involves y(0) and y’(0). We have to do partial fractions again: not good. 

The denominator $s^2 - 4s + 3$ has two factors $(s - 3)(s - 1)$ and not three factors.
But I would prefer to find $c_1$ and $c_2$ in the complete solution (15), by setting $t = 0$
and solving these two equations:

$$\begin{aligned} c_1 + c_2 + Y &= y(0) \\ 3c_1 + c_2 + aY &= y'(0) \end{aligned} \tag{18}$$
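Those two equations are quick to solve by elimination. The sketch below, in Python with exact fractions, is only an illustration: the function name `complete_solution_constants` and the sample value $a = 2$ are my own, and $a \neq 3$, $a \neq 1$ is assumed so that the gain $Y$ exists.

```python
from fractions import Fraction

def complete_solution_constants(a, y0, dy0):
    """Solve c1 + c2 + Y = y(0), 3 c1 + c2 + a Y = y'(0) for y'' - 4y' + 3y = e^{at}."""
    a, y0, dy0 = Fraction(a), Fraction(y0), Fraction(dy0)
    Y = 1 / ((a - 3) * (a - 1))        # gain 1/(a^2 - 4a + 3); needs a != 3 and a != 1
    r1, r2 = y0 - Y, dy0 - a * Y       # move the Y terms to the right side
    c1 = (r2 - r1) / 2                 # subtract the first equation from the second
    c2 = r1 - c1
    return c1, c2, Y

# Starting from rest with a = 2 (an arbitrary sample value)
c1, c2, Y = complete_solution_constants(2, 0, 0)
print(c1, c2, Y)
# Compare with the partial fraction constants C1 = 1/((3-1)(3-a)), C2 = 1/((1-3)(1-a))
print(Fraction(1, (3 - 1) * (3 - 2)), Fraction(1, (1 - 3) * (1 - 2)))
```

With zero initial values the small constants match the partial fraction constants, and any other start only changes $c_1$ and $c_2$.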


When $y(0)$ and $y'(0)$ are zero, that's when $c_1$ and $c_2$ and $y$ equal $C_1$ and $C_2$ and $y_{vp}$.

Transforms at Resonance

The reader will remember that when two exponents come together, and two solutions
become one solution like $e^{st}$, another solution is born. It is like atomic fission or fusion.
The new solution has the form $te^{st}$. We want to find its Laplace transform.

Equal exponents can happen in two different ways for $y'' + By' + Cy = f(t)$.

1 (Null solution) Two roots $s_1$ and $s_2$ of the characteristic polynomial become equal.
2 (Particular solution) The exponent in $f = e^{at}$ equals $s_1$ or $s_2$ in the null solution.

In a truly extreme case we might have $s_1 = s_2 = a$, three equal exponents. Then the
null solution is $c_1e^{at} + c_2te^{at}$, and a particular solution is $Gt^2e^{at}$.

We are seeing these possibilities in the "time domain" and we can see them in the
"frequency domain". Double roots in the t-domain become double poles in $Y(s)$.

$$\text{The Laplace transform of } te^{at} \text{ is } \frac{1}{(s - a)^2} \text{ with a double pole.} \tag{19}$$

A nice proof starts with a simple pole in the transform. The transform of $e^{at}$ is $1/(s - a)$.
Now take derivatives of both sides with respect to $a$:

$$\int_0^\infty e^{at}e^{-st}\,dt = \frac{1}{s - a} \qquad \int_0^\infty te^{at}e^{-st}\,dt = \frac{d}{da}\left(\frac{1}{s - a}\right) = \frac{1}{(s - a)^2}$$

If we take another $a$-derivative, the transform of $t^2e^{at}$ is seen as $2/(s - a)^3$ with a
triple pole. The simplest example of this extreme case would be the equation $y'' = 2$.

$y'' = 2$ has exponents 0 and 0 in $y_n(t) = c_1 + c_2t$ and $a = 0$ in $y_p(t) = t^2e^{0t} = t^2$.
The initial conditions give $c_1 = y(0)$ and $c_2 = y'(0)$. The solution is easy to check:

$$y = y(0) + ty'(0) + t^2 \text{ solves } y'' = 2. \tag{20}$$

To find this solution by Laplace transform, start by transforming $y''$ and 2:

$$s^2Y(s) - y(0)s - y'(0) = \frac{2}{s} \text{ gives } Y(s) = \frac{y(0)}{s} + \frac{y'(0)}{s^2} + \frac{2}{s^3}. \tag{21}$$

The inverse transforms of $1/s$ and $1/s^2$ are 1 and $t$. The inverse transform of $2/s^3$ is $t^2$.
So the inverse transform of $Y(s)$ is the correct $y = y(0) + ty'(0) + t^2$ in (20).
Those are really $e^{0t}$ and $te^{0t}$ and $t^2e^{0t}$: three zero exponents, a truly extreme case.


The inverse of equation (19) tells us the fundamental solution $g(t)$ when the transfer
function $1/(s^2 + Bs + C)$ has a double pole and $s^2 + Bs + C = 0$ has $s_1 = s_2$:

If $s^2 + Bs + C = (s - s_1)^2$ then the fundamental solution is $g(t) = te^{s_1t}$.

The Transforms of cos ωt and sin ωt

In all of this section on Laplace transforms, there is no requirement that $a$ must be real. That
exponent can be $i\omega$ or $-i\omega$ or any complex number $a + i\omega$. From the identity
$\cos\omega t = \frac{1}{2}(e^{i\omega t} + e^{-i\omega t})$, and from the linearity of the formula $F(s) = \int f(t)e^{-st}\,dt$, we can
combine the known transforms of $e^{i\omega t}$ and $e^{-i\omega t}$:

$$\text{The transform of } f(t) = \cos\omega t \text{ is } F(s) = \frac{1}{2}\left(\frac{1}{s - i\omega} + \frac{1}{s + i\omega}\right) = \frac{s}{s^2 + \omega^2}. \tag{22}$$

The twin identity $\sin\omega t = \frac{1}{2i}(e^{i\omega t} - e^{-i\omega t})$ also comes from Euler's formula.

$$\text{The transform of } f(t) = \sin\omega t \text{ is } F(s) = \frac{1}{2i}\left(\frac{1}{s - i\omega} - \frac{1}{s + i\omega}\right) = \frac{\omega}{s^2 + \omega^2}. \tag{23}$$


Those transforms appear in the fundamental example of a mass hanging from a spring:

Step 1 $my'' + ky = \cos\omega t$ transforms to $m(s^2Y(s) - sy(0) - y'(0)) + kY(s) = \dfrac{s}{s^2 + \omega^2}$

The transform $Y(s)$ is multiplied by $ms^2 + k$. The transfer function is $1/(ms^2 + k)$.

The transfer function multiplies the input to give the output. The input is on the
right hand side, the output is the solution. Both of those are now in transform space!

Step 2 Solve for $Y(s) = \dfrac{1}{ms^2 + k}\left[msy(0) + my'(0) + \dfrac{s}{s^2 + \omega^2}\right].$ (24)

We are ready for Step 3, but it doesn't look so easy. It requires the inverse transform of this
$Y(s)$. Our simple mass-spring problem has led us to a fourth degree denominator
$(ms^2 + k)(s^2 + \omega^2)$. We need partial fractions to separate $Y(s)$ into two pieces with
second degree denominators. That algebra is not so bad, and it can be left for Problem 26.

The result is that $y(t)$ has a term in $\cos\omega t$ and another term in $\cos\omega_nt$. The driving
frequency is $\omega$. The natural frequency $\omega_n = \sqrt{k/m}$ comes from the zeros of $ms^2 + k$.
The frequencies in the solution $y(t)$ are the poles $\pm i\omega$ and $\pm i\omega_n$ in its transform $Y(s)$.

That bold statement is really the important message from a Laplace transform. We
engineer the system or the network by moving those poles. Often we keep them well
separated to avoid instability. And we add damping to push the zeros of $ms^2 + bs + k$
(poles of $Y(s)$) off the imaginary axis and into the stable left halfplane where Re $s < 0$.


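Table of Transforms

$f(t)$                $F(s)$
$e^{at}$              $1/(s - a)$
$te^{at}$             $1/(s - a)^2$
$t^2e^{at}$           $2/(s - a)^3$
$\cos\omega t$        $s/(s^2 + \omega^2)$
$\sin\omega t$        $\omega/(s^2 + \omega^2)$
$y'$                  $sY(s) - y(0)$
$y''$                 $s^2Y(s) - sy(0) - y'(0)$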


Complex Roots a + iw 


Finally we come to the most typical case for physical systems. It has damping, and it has
oscillation. The roots of $s^2 + 2s + 5$ are complex. Their real parts are $a = -2/2 = -1$.
Their imaginary parts $\pm\sqrt{B^2 - 4AC}/2$ are $\pm i\omega = \pm\sqrt{-16}/2 = \pm 2i$. We are in
the underdamped case and the solutions to $y'' + 2y' + 5y = 0$ can be written two ways:

$$y = c_1e^{(-1+2i)t} + c_2e^{(-1-2i)t} \quad \text{or} \quad y = e^{-t}(C_1\cos 2t + C_2\sin 2t). \tag{25}$$

What does this problem look like in the s-domain, after a Laplace transform?

$$y'' + 2y' + 5y = 0 \text{ transforms to } (s^2 + 2s + 5)Y(s) - (s + 2)y(0) - y'(0) = 0. \tag{26}$$


That quadratic $s^2 + 2s + 5$ will go into the denominator of $Y(s)$, as always. This part of
$Y(s)$ is the transfer function $1/(s^2 + 2s + 5)$. The numerator is $(s + 2)y(0) + y'(0)$
from the initial conditions. The right hand side of our null equation (26) is zero and the
transfer function is connecting the inputs $y(0)$ and $y'(0)$ to the solution:

$$\text{The transform of } y(t) \text{ is } Y(s) = \frac{(s + 2)y(0) + y'(0)}{s^2 + 2s + 5}. \tag{27}$$

This is the point where partial fractions can enter, if we choose. We can separate
$s^2 + 2s + 5$ into its linear factors $(s - s_1)(s - s_2)$. I suggest not to do it. Those roots
$s_1$ and $s_2$ are complex numbers, and it is easier to stay with one real quadratic.


We are close to the transforms of $\cos\omega t$ and $\sin\omega t$, already in the Table above. The
new factor is $e^{at} = e^{-t}$ from the real part, and it gives decay.

$$e^{at}\cos\omega t \text{ and } e^{at}\sin\omega t \text{ transform to } \frac{s - a}{(s - a)^2 + \omega^2} \text{ and } \frac{\omega}{(s - a)^2 + \omega^2}. \tag{28}$$

For (27), the key is to separate $s^2 + 2s + 5$ into $(s + 1)^2 + 4$. From this we recognize
$a = -1$ and $\omega = 2$ as expected. Then the inverse transform combines $e^{-t}\cos 2t$ and
$e^{-t}\sin 2t$. The numerator in (27) is linear, call it $Hs + K$. To fit perfectly with the numerator
$s - a$ in (28), we can split any $Hs + K$ into $H(s - a) + (K + Ha)$:

$$\text{The inverse transform of } \frac{Hs + K}{(s - a)^2 + \omega^2} \text{ is } He^{at}\cos\omega t + (K + Ha)\frac{e^{at}\sin\omega t}{\omega}. \tag{29}$$
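That inverse transform can be tested on the underdamped example. The sketch below is a hedged illustration in Python, with hypothetical function names and sample initial values of my own choosing. It builds $y(t)$ for $y'' + 2y' + 5y = 0$ from $H = y(0)$ and $K = 2y(0) + y'(0)$, with $a = -1$ and $\omega = 2$, then checks the equation by finite differences.

```python
import math

def underdamped_solution(y0, dy0, t):
    """y(t) for y'' + 2y' + 5y = 0, inverting (Hs + K)/((s - a)^2 + omega^2)."""
    a, omega = -1.0, 2.0                 # from s^2 + 2s + 5 = (s + 1)^2 + 4
    H, K = y0, 2.0 * y0 + dy0            # numerator (s + 2) y(0) + y'(0) = H s + K
    return math.exp(a * t) * (H * math.cos(omega * t)
                              + (K + H * a) / omega * math.sin(omega * t))

def residual(y0, dy0, t, h=1e-4):
    """|y'' + 2y' + 5y| at time t, using central differences (step h is an assumption)."""
    y = lambda u: underdamped_solution(y0, dy0, u)
    d1 = (y(t + h) - y(t - h)) / (2 * h)
    d2 = (y(t + h) - 2 * y(t) + y(t - h)) / h**2
    return abs(d2 + 2 * d1 + 5 * y(t))

y0, dy0 = 1.0, 3.0                        # sample initial values
print(underdamped_solution(y0, dy0, 0.0))  # should reproduce y(0)
print(residual(y0, dy0, 0.8))              # should be near zero
```

The division by $\omega$ in the sine term is what makes the starting slope come out equal to $y'(0)$.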


For higher order equations, and for equations with exponential driving functions $f(t)$,
the transform $Y(s)$ involves polynomials of higher degree. In principle, partial fractions
can reduce to degree 1 and degree 2. Those produce the real poles and complex poles
of $Y(s)$: the real and complex exponentials $e^{st}$ in $y(t)$. I would certainly turn first to the
method of undetermined coefficients in Section 2.6.

The best contribution of Laplace transforms is to focus attention on transfer functions
like $1/(As^2 + Bs + C)$ and their poles.

REVIEW OF THE KEY IDEAS

1. The Laplace transform of $f(t)$ is $F(s) = \int_0^\infty f(t)e^{-st}\,dt$. $f = e^{at} \to F = \dfrac{1}{s - a}$.
2. $Ay'' + By' + Cy$ transforms to $(As^2 + Bs + C)Y(s) - (As + B)y(0) - Ay'(0)$.
3. Step 1 transforms the equation, Step 2 solves for $Y(s)$, Step 3 inverts $Y(s)$ to $y(t)$.
4. The exponents in the solutions $y_n(t)$ and $y_p(t)$ are the poles in $Y(s)$.
5. Partial fractions can simplify $Y(s)$ using PF2 and PF3, to help invert to $y(t)$.


Problem Set 2.7 


1 Take the Laplace transform of each term in these equations and solve for Y(s),
with y(0) = 0 and y'(0) = 1. Find the roots $s_1$ and $s_2$, the poles of Y(s):

Undamped            $y'' + 0y' + 16y = 0$
Underdamped         $y'' + 2y' + 16y = 0$
Critically damped   $y'' + 8y' + 16y = 0$
Overdamped          $y'' + 10y' + 16y = 0$

For the overdamped case use PF2 to write $Y(s) = A/(s - s_1) + B/(s - s_2)$.

2 Invert the four transforms Y(s) in Problem 1 to find y(t).


3 (a) Find the Laplace transform Y(s) from the equation $y' = e^{at}$ with y(0) = A.
(b) Use PF2 to break Y(s) into two fractions $C_1/(s - a) + C_2/s$.
(c) Invert Y(s) to find y(t) and check that $y' = e^{at}$ and y(0) = A.

4 (a) Find the transform Y(s) when $y'' = e^{at}$ with y(0) = A and y'(0) = B.
(b) Split Y(s) into $C_1/(s - a) + C_2/(s - a)^2 + C_3/s$.
(c) Invert Y(s) to find y(t). Check $y'' = e^{at}$ and y(0) = A and y'(0) = B.


5 Transform these differential equations to find Y(s):

(a) $y'' - y' = 1$ with y(0) = 4 and y'(0) = 0
(b) $y'' + y = \cos\omega t$ with y(0) = y'(0) = 0
(c) $y'' + y = \cos t$ with y(0) = y'(0) = 0. What changed for $\omega = 1$?


6 Find the Laplace transforms $F_1, F_2, F_3$ of these functions $f_1, f_2, f_3$:

$f_1(t) = e^{at} - e^{bt}$     $f_2(t) = e^{at} + e^{-at}$     $f_3(t) = t\cos t$
7 For any real or complex a, the transform of $f = te^{at}$ is ______. By writing
$\cos\omega t$ as $(e^{i\omega t} + e^{-i\omega t})/2$, transform $g(t) = t\cos\omega t$ and $h(t) = te^{t}\cos\omega t$.
(Notice that the transform of h is new.)


8 Invert the transforms $F_1, F_2, F_3$ using PF2 and PF3 to discover $f_1, f_2, f_3$:

$F_1(s) = \dfrac{1}{(s-a)(s-b)}$     $F_2(s) = \dfrac{s}{(s-a)(s-b)}$     $F_3(s) = \dfrac{1}{s^3 - s}$

9 Step 1 transforms these equations and initial conditions. Step 2 solves for Y(s).
Step 3 inverts to find y(t):

(a) $y' - ay = t$ with y(0) = 0
(b) $y'' + a^2 y = 1$ with y(0) = 1 and y'(0) = 2
(c) $y'' + 3y' + 2y = 1$ with y(0) = 4 and y'(0) = 5.

What particular solution $y_p$ to (c) comes from using “undetermined coefficients”?

Questions 10-16 are about partial fractions.


10 Show that PF2 in equation (9) is correct. Multiply both sides by (s - a)(s - b):

$$(*) \qquad 1 = \frac{s-b}{a-b} + \frac{s-a}{b-a}$$

(a) What do those two fractions in (*) equal at the points s = a and s = b?
(b) The equation (*) is correct at those two points a and b. It is the equation of
a straight ______. So why is it correct for every s?


11 Here is the PF2 formula with numerators. Formula (*) had K = 1 and H = 0:

$$\text{PF2}' \qquad \frac{Hs+K}{(s-a)(s-b)} = \frac{Ha+K}{(s-a)(a-b)} + \frac{Hb+K}{(b-a)(s-b)}$$

To show that PF2' is correct, multiply both sides by (s - a)(s - b). You are left
with the equation of a straight ______. Check your equation at s = a and at s = b.
Now it must be correct for all s, and PF2' is proved.


12 Break these functions into two partial fractions using PF2 and PF2':

(a) $\dfrac{1}{s^2 - 4}$     (b) $\dfrac{s}{s^2 - 4}$     (c) $\dfrac{Hs+K}{s^2 - 5s + 6}$


13 Find the integrals of (a)(b)(c) in Problem 12 by integrating each partial fraction. The 
integrals of C/(s — a) and D/(s — b) are logarithms. 


14 Extend PF3 to PF3' in the same way that PF2 extended to PF2':

$$\text{PF3}' \qquad \frac{Gs^2+Hs+K}{(s-a)(s-b)(s-c)} = \frac{Ga^2+Ha+K}{(s-a)(a-b)(a-c)} + \frac{?}{?} + \frac{?}{?}$$

15 The linear polynomial (s - b)/(a - b) equals 1 at s = a and 0 at s = b. Write down a
quadratic polynomial that equals 1 at s = a and 0 at s = b and s = c.

16 What is the number C so that C(s - b)(s - c)(s - d) equals 1 at s = a?


Note  A complete theory of partial fractions must allow double roots (when b = a). The
formula can be discovered from l'Hôpital's Rule (in PF3 for example) when
b approaches a. Multiple roots lose the beauty of PF3 and PF3', so we are happy
to stay with simple roots a, b, c.


Questions 17-21 involve the transform F(s) = 1 of the delta function $f(t) = \delta(t)$.

17 Find F(s) from its definition $\int_0^\infty f(t)\,e^{-st}\,dt$ when $f(t) = \delta(t - T)$, T > 0.

18 Transform $y'' - 2y' + y = \delta(t)$. The impulse response y(t) transforms into Y(s) =
transfer function. The double root $s_1 = s_2 = 1$ gives a double pole and a new y(t).

19 Find the inverse transforms y(t) of these transfer functions Y(s):

(a) $\dfrac{s}{s-a}$     (b) $\dfrac{s}{s^2 - a^2}$     (c) $\dfrac{s^2}{s^2 - a^2}$

20 Solve $y'' + y = \delta(t)$ by Laplace transform, with y(0) = y'(0) = 0. If you found
y(t) = sin t as I did, this involves a serious mystery: That sine solves y'' + y = 0,
and it doesn't have y'(0) = 0. Where does $\delta(t)$ come from? In other words, what is
the derivative of y' = cos t if all functions are zero for t < 0?

If y = sin t, explain why $y'' = -\sin t + \delta(t)$. Remember that y = 0 for t < 0.

Problem (20) connects to a remarkable fact. The same impulse response y = g(t)
solves both of these equations: An impulse at t = 0 makes the velocity y'(0)
jump by 1. Both equations start from y(0) = 0.

$y'' + By' + Cy = \delta(t)$ with y'(0) = 0          $y'' + By' + Cy = 0$ with y'(0) = 1.

21 (Similar mystery) These two problems give the same $Y(s) = s/(s^2 + 1)$ and the same
impulse response y(t) = g(t) = cos t. How can this be?

$y' = -\sin t$ with y(0) = 1          $y' = -\sin t + \delta(t)$ with “y(0) = 0”

Problems 22-24 involve the Laplace transform of the integral of y(t).

22 If f(t) transforms to F(s), what is the transform of the integral $h(t) = \int_0^t f(T)\,dT$?
Answer by transforming the equation dh/dt = f(t) with h(0) = 0.

23 Transform and solve the integro-differential equation $y' + \int_0^t y\,dt = 1$, y(0) = 0.

24 A mystery like Problem 20: y = cos t seems to solve $y' + \int_0^t y\,dt = 0$, y(0) = 1.

25 Transform and solve the amazing equation $dy/dt + \int_0^t y\,dt = \delta(t)$.

26 The derivative of the delta function is not easy to imagine. It is called a “doublet”
because it jumps up to $+\infty$ and back down to $-\infty$. Find the Laplace transform of the
doublet $d\delta/dt$ from the rule for the transform of a derivative.

A doublet $\delta'(t)$ is known by its integral: $\int \delta'(t)F(t)\,dt = -\int \delta(t)F'(t)\,dt = -F'(0)$.

27 (Challenge) What function y(t) has the transform $Y(s) = \dfrac{1}{(s^2 + \omega^2)(s^2 + a^2)}$?
First use partial fractions to find H and K:

$$Y(s) = \frac{H}{s^2 + \omega^2} + \frac{K}{s^2 + a^2}$$

28 Why is the Laplace transform of a unit step function H(t) the same as the Laplace
transform of a constant function f(t) = 1?




Chapter 3 


Graphical and Numerical Methods 


The world of differential equations is large (very large). This page aims to see what is already 
done and what remains to do. 

Chapters 1 and 2 concentrated on equations we can solve. Compared to digging for
coal or drilling for oil, this was the equivalent of picking up gold. Solutions were
waiting for us. Looking back honestly, we just wrote them down (not so easy in Chapter 2).

Above all I am thinking of $e^{at}$ in Chapter 1 and $e^{st}$ in Chapter 2 and $e^{\lambda t}$ coming
in Chapter 6 (with eigenvalues and eigenvectors). When the equation is linear, and
its coefficients are constant, then its solutions are exponentials.


Chapter 1 First order equations (linear or separable or exact or special) 

Chapter 2 Second order equations Ay” + By’ + Cy = f(t) 

Chapter 6 First order systems y’ = Ay + f(t) with matrices A and vectors y. 

Chapter 3 will be different. Instead of f(t) we have f(t, y). Most nonlinear problems
don't allow a formula for y(t). “A solution exists but it has no formula.” This is the
hard reality of differential equations y' = f(t, y). The equations are important but they
don't have exponential answers. This chapter pictures the solution, computes the solution,
and decides if the solution is stable.

Section 3.1 Pictures for nonlinear equations y' = f(t, y): Stability decided by $\partial f/\partial y$.
Section 3.2 Pictures for linear second order equations and 2 by 2 systems: Stable or not.
Section 3.3 Test for stability at critical points by linearizing systems of equations.
Section 3.4 Euler methods (safe but slow) for computing approximations to y.
Section 3.5 Fast and accurate computations, by methods more efficient than Euler.
Science and engineering and finance constantly use Runge-Kutta.

After this chapter, the book will move into high dimensions: the world of linear algebra.
One particle and one resistor and one spring and one of anything: that was only a start. The
reality is a network of connections: a brain, a living body, a modern machine, a web of
processors. Every network leads to a matrix. You will learn how to read a matrix.


In my opinion, linear algebra is pure gold. 
3.1 Nonlinear Equations y' = f(t, y)


This section aims to get a picture of y(t), not a formula. The pictures will be graphs in
the t-y plane (t across and y(t) up). The differential equation is dy/dt = f(t, y)
and everything depends on that function f. I can start with a linear equation y' = 2y.

The solutions to y' = 2y are $y(t) = Ce^{2t}$. For every number C this gives a solution
curve from $t = -\infty$ to $t = \infty$. Those curves cover every point in the t-y plane.
This is the “solution picture” we want for nonlinear equations y' = f(t, y).

That solution $y = Ce^{2t}$ has a graph. The plane is filled with those graphs. Every point
t, y has one of those curves going through it (choose the right C). A different equation
$y' = \sin(ty)$ won't have a formula. Its picture starts with just this one fact:

$dy/dt = \sin(ty)$     The solution curve through the point t, y has the slope $\sin(ty)$.


From that point picture we have to build a curve picture. This section tries to connect 
small arrows at points into solution curves through those points. The arrow at the point 
t, y has the right slope f(t, y). Connecting with other arrows is the hard part. 

I will separate this section into facts about y(t) and pictures of y(t). 


Facts About y(t) 


The facts will be answers to these questions, and the Chapter 3 Notes add more: 


1. Starting from y(0) at t = 0, does dy/dt = f(t, y) have a solution ? 


2. Could there be two or more solutions that start from the same y(0) ? 


Question 1 is about existence of y(t). Is there a solution curve through t = 0, y = y(0)?
Question 2 is about uniqueness of y(t). Could two solution curves go through one point?
When f(t, y) is reasonable, we expect exactly one curve through every point t, y:
existence and also uniqueness. Which functions are reasonable? Here are answers:

1. A solution exists if f(t, y) is a continuous function for t near 0 and y near y(0).

2. There can't be two solutions with the same y(0) when $\partial f/\partial y$ is also continuous.

The word “continuous” has a precise technical meaning. Let me be imprecise and
nontechnical. Continuity at a point rules out jumps and infinities in a small neighborhood
of that point. The particular function f = y/t is certainly ruled out at points where t = 0:


$\dfrac{dy}{dt} = \dfrac{y}{t}$ with y(0) = 0 has infinitely many solutions y = Ct.

The particular function f = t/y is also ruled out when y(0) = 0 (no division by 0):

$\dfrac{dy}{dt} = \dfrac{t}{y}$ with y(0) = 0 has two solutions y(t) = t and y(t) = -t.
In those examples, y/t and t/y are starting from 0/0. Solutions do exist (that fact
wasn't guaranteed). Solutions are not unique (no surprise). We ask more from f(t, y).
There is one important point that we emphasize here, because it could easily be missed.

Continuity of f and $\partial f/\partial y$ at all points does not guarantee that solutions reach $t = \infty$.

Yes, there will be a solution starting from y(0). That solution will be unique. But y(t) could
blow up at some finite time t. The first nonlinear equation in the book (Section 1.1) was an
example of early explosion:

Blow-up at t = 1     The solution to $\dfrac{dy}{dt} = y^2$ with y(0) = 1 is $y(t) = \dfrac{1}{1-t}$.

That function $f = y^2$ is certainly continuous. Its derivative $\partial f/\partial y = 2y$ is also continuous.
But the derivative 2y grows when the solution grows. To be sure there is no explosion at a
finite time t, we ask for an upper bound L on the continuous function $\partial f/\partial y$:

If $\left|\dfrac{\partial f}{\partial y}\right| \le L$ for all t and y there is a unique solution through y(0) reaching all t.
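The blow-up at t = 1 can be watched numerically. The sketch below is not a method from this section: it borrows a standard fourth-order Runge-Kutta step (the kind of method Section 3.5 is described as covering) in a hypothetical helper `rk4`, and compares the computed values with the exact solution 1/(1 - t) as t gets close to 1.

```python
def rk4(f, t0, y0, t1, n):
    """Classical fourth-order Runge-Kutta from t0 to t1 in n equal steps."""
    h = (t1 - t0) / n
    t, y = t0, y0
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

def f(t, y):
    return y * y      # f = y^2 is continuous, but df/dy = 2y has no upper bound L

for t_end in (0.5, 0.9, 0.99):
    approx = rk4(f, 0.0, 1.0, t_end, 20000)
    exact = 1.0 / (1.0 - t_end)
    print(t_end, approx, exact)    # both columns grow without limit as t_end approaches 1
```

No choice of step size carries the computation past t = 1, because the true solution itself stops there.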


For a linear differential equation y' = a(t)y + q(t), the derivative $\partial f/\partial y$ of the right hand
side is just a(t). Then if $|a(t)| \le L$ and q(t) is continuous for all time, solution curves go
from $t = -\infty$ to $t = \infty$. Chapter 1 found a formula for y(t) in this linear case.

I will end with one final nonlinear fact. The condition $|\partial f/\partial y| \le L$ is pushed to its limit
when $\partial f/\partial y = L$ exactly. Then y' = Ly + q(t). A comparison with this linear equation
gives information about the nonlinear equation, when $|\partial f/\partial y| \le L$:

If y' = f(t, y) and z' = f(t, z), then $|y(t) - z(t)| \le e^{Lt}\,|y(0) - z(0)|$.     (1)


If y(t) and z(t) start very close, they stay close. This is the opposite of what you see on 
the cover of this book. The cover shows a famous example of chaos: solutions go wild. 
A slight change in y(0) will send the solution on a completely different (and distant) path. 
We now know that Pluto's orbit is chaotic: very very unpredictable. The equations allow it,
because they don't have $|\partial f/\partial y| \le L$. Pluto is not a planet.
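Inequality (1) can be tested on a sample equation. The sketch below assumes f(t, y) = sin y, chosen here only because |∂f/∂y| = |cos y| ≤ 1 gives L = 1; the two starting values and the Runge-Kutta helper `rk4_step` are likewise illustrative choices, not taken from the text.

```python
import math

def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def f(t, y):
    return math.sin(y)          # |df/dy| = |cos y| <= 1, so L = 1

L = 1.0
y, z = 1.0, 1.1                 # two nearby starting values y(0) and z(0)
gap0 = abs(y - z)
h, t = 0.01, 0.0
bound_holds = True
for _ in range(500):            # follow both solutions from t = 0 to t = 5
    y = rk4_step(f, t, y, h)
    z = rk4_step(f, t, z, h)
    t += h
    if abs(y - z) > math.exp(L * t) * gap0:
        bound_holds = False
print(bound_holds, abs(y - z), math.exp(L * t) * gap0)
```

For this equation the two solutions actually draw together (both head toward y = π), so the bound holds with a lot of room. The inequality only promises that they cannot separate faster than $e^{Lt}$.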


Pictures of the Solution 


Example 1     dy/dt = 2 - y     Solution $y(t) = 2 + Ce^{-t}$     $y(\infty) = 2$

The perfect picture of y' = 2 - y would show a small arrow at every point t, y. The
arrow would have slope s = 2 - y. Along the all-important “steady state line” y = 2,
this slope would be zero. The arrows are flat (s = 0) along that line: a constant solution.
Above that steady line, the slope 2 - y is negative. The vectors have components dt across
and dy = (2 - y)dt down. We don't have space for an arrow at every point,
but Figure 3.1 gives the idea. MATLAB calls the field of arrows a “quiver”.
Figure 3.1: (a) Arrows with slopes f(t, y) show the direction of the solution curves y(t). 
(b) Along an isocline f(t, y) = s, all arrows have the same slope s. Here s = 2 — y. 


Notice that all arrows point toward the line y = 2. That steady state solution is stable.
The formula $y(t) = 2 + Ce^{-t}$ confirms that the solutions approach y = 2.
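The arrows of the direction field are easy to tabulate. This is a small sketch for y' = 2 - y only; the helper `arrow` and the arrow length 0.2 are arbitrary choices made for illustration.

```python
import math

def arrow(t, y, length=0.2):
    """Components (dt, dy) of a short arrow at (t, y) with slope f(t, y) = 2 - y."""
    s = 2.0 - y
    scale = length / math.hypot(1.0, s)   # rescale so every arrow has the same length
    return scale, scale * s

for y in (0.0, 1.0, 2.0, 3.0, 4.0):
    dt, dy = arrow(0.0, y)
    print(f"y = {y}: slope = {2.0 - y:+.1f}, arrow = ({dt:.3f}, {dy:+.3f})")
```

Below y = 2 the dy component is positive, above y = 2 it is negative, and on the line y = 2 it is zero. That is the statement that all arrows point toward the steady state line.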


First key idea: The solution curves $y(t) = 2 + Ce^{-t}$ are tangent to the arrows.
Tangent means: The curves have the same slope s = 2 - y as the arrows! The curves
solve the equation, the equation specifies the slopes, the arrows have correct slopes.

Second key idea: Put your arrows along isoclines. An isocline (meaning “same slope”)
is a curve f(t, y) = constant. This idea makes the arrows much easier to draw. All the
isoclines 2 - y = s are horizontal lines for this equation y' = 2 - y. When the differential
equation is dy/dt = f(t, y), each choice of slope s produces an isocline f(t, y) = s.

In our example, those isoclines 2 - y = s are flat because f(t, y) = 2 - y does not depend
on t (autonomous equation). I start the picture by drawing a few isoclines.
I always draw the isocline f(t, y) = 0 (here 2 - y = 0 is the steady state line y = 2).
For this equation, that “nullcline” or “zerocline” with s = 0 is also a solution curve.
The arrows have slope zero when y = 2, so they point along the flat line.

How to understand these pictures? The arrows are pointing along the solution curves.
The curves cross over isoclines. But they don't cross over the zero isocline y = 2.
All arrows are pointing toward the line y = 2. Those arrows will eventually take us
across every other isocline. The pictures say that the solution curves y(t) are asymptotic to
that line y = 2. For this equation dy/dt = 2 - y we know the solutions $y = 2 + Ce^{-t}$.


Figure 3.2: Solution curves (tangent to arrows) go through isoclines: y’ = 2 — y. 
Example 2     $\dfrac{dy}{dt} = y - y^2$     Solutions $y(t) = \dfrac{1}{1 + Ce^{-t}}$     $y(\infty) = 1$ or $-\infty$

The slope of every small arrow is $y - y^2$. In the range 0 < y < 1, y will be larger
than $y^2$. The arrows have positive slope $y - y^2$ in this range (small slope near y = 0,
small slope near y = 1, all up and to the right). The other two ranges are above y = 1
and below y = 0. There the slopes $y - y^2$ are negative: arrows go down and right.
The solution curves are steep when y is large, because $y^2 \gg y$.

Figure 3.3 shows the isoclines $f(t, y) = y - y^2 = s$ = constant. Again f does not
depend on t! The equation is autonomous, the isoclines are flat lines. There are two zeroclines
y = 1 and y = 0 (where dy/dt = 0 and y is constant). Those arrows have
zero slope and the graph of y(t) runs along each zerocline: a steady state.

The question is about all the other solution curves: What do they do? We happen to
have a formula for y(t), but the point is that we don't need it. Figure 3.3 shows the three
possibilities for the solution curves to the logistic equation $y' = y - y^2$:

1. Curves above y = 1 go from $+\infty$ down toward the line y = 1 (dropin curves)
2. Curves between y = 0 and y = 1 go up toward that line y = 1 (S-curves)
3. Curves below y = 0 go down (fast) toward $y = -\infty$ (dropoff curves).

The solution curves go across all isoclines except the two zeroclines where $y - y^2 = 0$.




Figure 3.3: The arrows form a “direction field”. Isoclines $y - y^2 = s$ attract or repel.


You see the S-curves between 0 and 1. The arrows are flat as they leave y = 0, steepest
at y = 1/2, flat again as they approach y = 1. The dropoff curves are below y = 0.
Those arrows get very steep and the curves never reach $t = \infty$: $y = 1/(1 - e^{-t})$ gives
1/0 = minus infinity when t = 0 (approached from t < 0). That dropoff curve never gets out of the third quadrant.
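The three kinds of curves can be produced from the formula y(t) = 1/(1 + Ce^{-t}) in Example 2. The sketch below picks three sample constants C (my choices, one for each kind of curve) and uses a finite-difference check, via the hypothetical helper `residual`, that each curve satisfies y' = y - y².

```python
import math

def logistic(C):
    """Solution y(t) = 1 / (1 + C e^{-t}) of the logistic equation y' = y - y^2."""
    return lambda t: 1.0 / (1.0 + C * math.exp(-t))

def residual(y, t, h=1e-5):
    """Central-difference estimate of y' - (y - y^2) at time t (should be near zero)."""
    dy = (y(t + h) - y(t - h)) / (2 * h)
    return dy - (y(t) - y(t) ** 2)

s_curve = logistic(1.0)      # y(0) = 1/2, rises toward y = 1
dropin = logistic(-0.5)      # y(0) = 2, falls toward y = 1
dropoff = logistic(-1.0)     # negative for t < 0, unbounded below as t approaches 0

print(residual(s_curve, 0.0), residual(dropin, 1.0), residual(dropoff, -1.0))
print(s_curve(10.0), dropin(10.0))                          # both close to the steady state y = 1
print([dropoff(t) for t in (-1.0, -0.1, -0.01, -0.001)])    # heading to minus infinity
```

The last line shows the dropoff curve falling faster and faster as t approaches 0 from the left, which is why it never leaves the third quadrant.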
Important  Solution curves have a special feature for autonomous equations y' = f(y).
Suppose the curve y(t) is shifted right or left to the curve Y(t) = y(t + C). Then Y(t)
solves the same equation Y' = f(Y): both sides are just shifted in the same way.

Conclusion: The solution curves for autonomous equations y' = f(y) just shift along
with no change in shape. You can also see this by integrating dy/f(y) = dt (separable
equation). The right side integrates to t + C. We get all solutions by allowing all C.

In the logistic example, all S-curves and dropin curves and dropoff curves come from 
shifting one S-curve and one dropin curve and one dropoff curve. 
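For the logistic formula the shift can be checked by hand. The short calculation below is a sketch using the solution y(t) = 1/(1 + Ce^{-t}) from Example 2; the shift amount c (lowercase, to keep it apart from the constant C in the formula) is notation introduced here only for illustration.

```latex
Y(t) = y(t + c) = \frac{1}{1 + C e^{-(t+c)}} = \frac{1}{1 + \left(C e^{-c}\right) e^{-t}}
```

So a shift by c gives another curve of the same family, with C replaced by Ce^{-c}. The sign of C does not change, which is one way to see that a shifted S-curve (C > 0) is still an S-curve.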


Solution Curves Don’t Meet 


Is there a solution curve through every point (t, y)? Could two solution curves meet at 
that point? Could a solution curve suddenly end at a point? These “picture questions” 
are already answered by the facts. 

At the start of this section, the functions f and $\partial f/\partial y$ were required to be continuous
near t = 0, y = y(0). Then there is a unique solution to y' = f(t, y) with that start. In
the picture this means: There is exactly one solution curve going through the point. The
curve doesn't stop. By requiring f and $\partial f/\partial y$ to be continuous at and near all points, we
guarantee one non-stopping solution curve through every point.

Example 3 will fail! The solution curves for dy/dt = -t/y are half-circles and not
whole circles. They start and stop and meet on the line y = 0 (where f = -t/y is not
continuous). Exactly one semicircular curve passes through every point with $y \ne 0$.


Example 3     dy/dt = -t/y is separable. Then y dy = -t dt leads to $y^2 + t^2 = C$.


Start again with pictures. The isocline f(t,y) = —t/y = s is the line y = (—1/s)t. 
All those isoclines go through (0,0) which is a very singular point. In this example the 
direction arrows with slope s are perpendicular to the isoclines with slope dy/dt = —1/s. 

The isoclines are rays out from (0,0). The arrow directions are perpendicular to
those rays and tangent to the solution curves. The curves are half-circles $y^2 + t^2 = C$.
(There is another half-circle on the opposite side of the axis. So two solutions start from
y = 0 at time -T and go forward to y = 0 at time T.) The solution curves stop at y = 0,
where the function f = -t/y loses its continuity and the solution loses its life.
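Both claims in Example 3 can be spot-checked: the half-circle has the slope -t/y that the equation demands, and that slope is perpendicular to the ray (isocline) through the same point. The sketch below assumes the sample value C = 4 (radius 2) and a few sample times; `upper_half_circle` is a hypothetical helper name.

```python
import math

def upper_half_circle(C):
    """Upper solution curve y = sqrt(C - t^2) of y' = -t/y, defined for |t| < sqrt(C)."""
    return lambda t: math.sqrt(C - t * t)

y = upper_half_circle(4.0)                     # radius 2
h = 1e-6
for t in (-1.5, -0.5, 0.5, 1.5):
    slope = (y(t + h) - y(t - h)) / (2 * h)    # numerical dy/dt along the half-circle
    arrow = -t / y(t)                          # slope required by the equation
    ray = y(t) / t                             # slope of the ray from (0, 0) through (t, y)
    print(t, slope, arrow, arrow * ray)        # arrow * ray = -1 means perpendicular
```

The check avoids t = 0 (where the ray is vertical) and the endpoints t = ±2 (where y = 0 and the slope -t/y is undefined).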


Figure 3.4: For y’ = —t/y the isoclines are rays. The solution curves are half-circles. 
Example 4     y' = 1 + t - y is linear but not separable. The isoclines trap the solution.

Trapping between isoclines is a neat part of the picture. It is based on the arrows.
All arrows go one way across an isocline, so all solution curves go that way. Solutions
that cross the isocline can't cross back. The zero isocline f(t, y) = 1 + t - y = 0 in
Figure 3.5 is the line y = t + 1. Along that isocline the arrows have slope 0. The solution curves
must cross from above to below.

The central isocline 1 + t — y = 1 in Figure 3.5 is the 45° line y = t. This solves 
the differential equation! The arrow directions are exactly along the line: slope s = 1. 
Other solution curves could never touch this one. 

The picture shows solution curves in a “lobster trap” between the lines: the curves
can't escape. They are trapped between the line y = t and every isocline 1 + t - y = s
above or below it. The trap gets tighter and tighter as s increases from 0 to 1, and the
isocline gets closer to y = t. Conclusion from the picture: The solution y(t) must approach t.

This is a linear equation y' + y = 1 + t. The null solutions to y' + y = 0 are $Ce^{-t}$.
The forcing term 1 + t is a polynomial. A particular solution comes by substituting
$y_p(t) = at + b$ into the equation and solving for those undetermined coefficients a and b:

$(at + b)' = 1 + t - (at + b)$     a = 1 and b = 0     $y = y_n + y_p = Ce^{-t} + t$     (2)

The solution curves $y = Ce^{-t} + t$ do approach the line y = t asymptotically as $t \to \infty$.
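The trapping can also be seen by computing a few solutions. The sketch below integrates y' = 1 + t - y from three sample starting values (my choices) with a standard Runge-Kutta helper `rk4`, which is not a method from this section, and compares the result at t = 5 with the formula $Ce^{-t} + t$ from (2).

```python
import math

def f(t, y):
    return 1.0 + t - y

def rk4(f, t, y, t_end, n):
    """Classical fourth-order Runge-Kutta from t to t_end in n equal steps."""
    h = (t_end - t) / n
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

for y0 in (-2.0, 0.0, 3.0):
    C = y0                                   # y(0) = C e^0 + 0, so C = y(0)
    y5 = rk4(f, 0.0, y0, 5.0, 500)
    formula = C * math.exp(-5.0) + 5.0
    print(y0, y5, formula, y5 - 5.0)         # last column: distance from the line y = t
```

Every starting value ends up within about $|C|e^{-5}$ of the line y = t, and the start y(0) = 0 stays exactly on it.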


Figure 3.5: The solution curves for y’ = 1 +t — y get trapped between the 45° isoclines. 


■ REVIEW OF THE KEY IDEAS ■

1. The direction field for y' = f(t, y) has an arrow with slope f at each point t, y.

2. Along the isocline f(t, y) = s, all arrows have the same slope s.

3. The solution curves y(t) are tangent to the arrows. One way through isoclines!

4. Fact: When f and $\partial f/\partial y$ are continuous, the curves cover the plane and don't meet.

5. The solution curves for autonomous y' = f(y) shift left-right to Y(t) = y(t - T).
Problem Set 3.1

(a) Why do two isoclines $f(t, y) = s_1$ and $f(t, y) = s_2$ never meet?
(b) Along the isocline f(t, y) = s, what is the slope of all the arrows?
(c) Then all solution curves go only one way across an ______.

(a) Are isoclines $f(t, y) = s_1$ and $f(t, y) = s_2$ always parallel? Always straight?
(b) An isocline f(t, y) = s is a solution curve when its slope equals ______.
(c) The zerocline f(t, y) = 0 is a solution curve only when y is ______: slope 0.

If $y_1(0) < y_2(0)$, what continuity of f(t, y) assures that $y_1(t) < y_2(t)$ for all t?

The equation dy/dt = t/y is completely safe if $y(0) \ne 0$. Write the equation as
y dy = t dt and find its unique solution starting from y(0) = -1. The solution curves
are hyperbolas. Can you draw two on the same graph?

The equation dy/dt = y/t has many solutions y = Ct in case y(0) = 0. It has
no solution if $y(0) \ne 0$. When you look at all solution curves y = Ct, which points
in the t, y plane have no curve passing through?

For y' = ty draw the isoclines ty = 1 and ty = 2 (those will be hyperbolas).
On each isocline draw four arrows (they have slopes 1 and 2). Sketch pieces of solution
curves that fit your picture between the isoclines.

The solutions to y' = y are $y = Ce^{t}$. Changing C gives a higher or lower curve. But
y' = y is autonomous, its solution curves should be shifting right and left!

Draw $y = 2e^{t}$ and $y = -2e^{t}$ to show that they really are right-left shifts of $y = e^{t}$
and $y = -e^{t}$. The shifted solutions to y' = y are $e^{t+C}$ and $-e^{t+C}$.

For $y' = 1 - y^2$ the flat lines y = constant are isoclines $1 - y^2 = s$. Draw the
lines y = 0 and y = 1 and y = -1. On each line draw arrows with slope $1 - y^2$.
The picture says that y = ______ and y = ______ are steady state solutions. From
the arrows on y = 0, guess a shape for the solution curve $y = (e^{t} - e^{-t})/(e^{t} + e^{-t})$.

The parabola $y = t^2/4$ and the line y = 0 are both solution curves for $y' = \sqrt{|y|}$.
Those curves meet at the point t = 0, y = 0. What continuity requirement is failed
by $f(y) = \sqrt{|y|}$, to allow more than one solution through that point?

Suppose y = 0 up to time T is followed by the curve $y = (t - T)^2/4$. Does this
solve $y' = \sqrt{|y|}$? Draw this y(t) going through flat isoclines $\sqrt{|y|} = 1$ and 2.

The equation $y' = y^2 - t$ is often a favorite in MIT's course 18.03: not too easy.
Why do solutions y(t) rise to their maximum on $y^2 = t$ and then descend?


Construct f(t,y) with two isoclines so solution curves go up through the higher 
isocline and other solution curves go down through the lower isocline. True or false : 
Some solution curve will stay between those isoclines: A continental divide. 
3.2 Sources, Sinks, Saddles, and Spirals


The pictures in this section show solutions to Ay'' + By' + Cy = 0. These are linear
equations with constant coefficients A, B, and C. The graphs show solutions y on the
horizontal axis and their slopes y' = dy/dt on the vertical axis. These pairs (y(t), y'(t))
depend on time, but time is not in the pictures. The paths show where the solution goes,
but they don't show when.

Each specific solution starts at a particular point (y(0), y'(0)) given by the initial
conditions. The point moves along its path as the time t moves forward from t = 0.
We know that the solutions to Ay'' + By' + Cy = 0 depend on the two solutions to
$As^2 + Bs + C = 0$ (an ordinary quadratic equation for s). When we find the roots $s_1$
and $s_2$, we have found all possible solutions:

$y = c_1 e^{s_1 t} + c_2 e^{s_2 t}$          $y' = c_1 s_1 e^{s_1 t} + c_2 s_2 e^{s_2 t}$          (1)

The numbers $s_1$ and $s_2$ tell us which picture we are in. Then the numbers $c_1$ and $c_2$ tell us
which path we are on.

Since $s_1$ and $s_2$ determine the picture for each equation, it is essential to see the six
possibilities. We write all six here in one place, to compare them. Later they will appear in
six different places, one with each figure. The first three have real solutions $s_1$ and $s_2$. The
last three have complex pairs $s = a \pm i\omega$.


Sources            Sinks              Saddles            Spiral out               Spiral in                Center
$s_1 > s_2 > 0$    $s_1 < s_2 < 0$    $s_2 < 0 < s_1$    $a = \text{Re}\,s > 0$    $a = \text{Re}\,s < 0$    $a = \text{Re}\,s = 0$

In addition to those six, there will be limiting cases s = 0 and $s_1 = s_2$ (as in resonance).

Stability  This word is important for differential equations. Do solutions decay to zero?
The solutions are controlled by $e^{s_1 t}$ and $e^{s_2 t}$ (and in Chapter 6 by $e^{\lambda_1 t}$ and $e^{\lambda_2 t}$).
We can identify the two pictures (out of six) that are displaying full stability: the sinks.

2. Sinks are stable             $s_1 < s_2 < 0$                            Then $y(t) \to 0$
5. Spiral sinks are stable      $\text{Re}\,s_1 = \text{Re}\,s_2 < 0$      Then $y(t) \to 0$

A center $s = \pm i\omega$ is at the edge of stability ($e^{i\omega t}$ is neither decaying nor growing).


Special note. May I mention here that the same six pictures also apply to a system of
two first order equations. Instead of y and y', the equations have unknowns $y_1$ and $y_2$.
Instead of the constant coefficients A, B, C, the equations will have a 2 by 2 matrix.
Instead of the roots $s_1$ and $s_2$, that matrix will have eigenvalues $\lambda_1$ and $\lambda_2$. Those
eigenvalues are the roots of an equation $A\lambda^2 + B\lambda + C = 0$, just like $s_1$ and $s_2$.

We will see the same six possibilities for the $\lambda$'s, and the same six pictures. The
eigenvalues of the 2 by 2 matrix give the growth rates or decay rates, in place of $s_1$ and $s_2$.

$$\begin{bmatrix} y_1' \\ y_2' \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \quad\text{has solutions}\quad \begin{bmatrix} y_1(t) \\ y_2(t) \end{bmatrix} = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} e^{\lambda t}$$

The eigenvalue is $\lambda$ and the eigenvector is $v = (v_1, v_2)$. The solution is $y(t) = ve^{\lambda t}$.
The First Three Pictures


We are starting with the case of real roots $s_1$ and $s_2$. In the equation Ay'' + By' + Cy = 0,
this means that $B^2 > 4AC$. Then B is relatively large. The square root in the quadratic
formula produces a real number $\sqrt{B^2 - 4AC}$. If A, B, C have the same sign, we have
overdamping and negative roots and stability. The solutions decay to (0,0): a sink.

If A and C have opposite sign to B as in y'' - 3y' + 2y = 0, we have negative damping
and positive roots $s_1$, $s_2$. The solutions grow (this is instability: a source at (0,0)).

Suppose A and C have different signs, as in y'' - 3y' - 2y = 0. Then $s_1$ and $s_2$ also
have different signs and the picture shows a saddle. The moving point (y(t), y'(t)) can
start in toward (0,0) before it turns out to infinity. The positive s gives $e^{st} \to \infty$.

Second example for a saddle: y'' - 4y = 0 leads to $s^2 - 4 = (s - 2)(s + 2) = 0$.
The roots $s_1 = 2$ and $s_2 = -2$ have opposite signs. Solutions $c_1 e^{2t} + c_2 e^{-2t}$ grow
unless $c_1 = 0$. Only that one line with $c_1 = 0$ has arrows inward.

In every case with $B^2 > 4AC$, the roots are real. The solutions y(t) have growing
exponentials or decaying exponentials. We don't see sines and cosines and oscillation.

The first figure shows growth: $0 < s_2 < s_1$. Since $e^{s_1 t}$ grows faster than $e^{s_2 t}$, the larger
number $s_1$ will dominate. The solution path for (y, y') will approach the straight line of
slope $s_1$. That is because the ratio of $y' = c_1 s_1 e^{s_1 t}$ to $y = c_1 e^{s_1 t}$ is exactly $s_1$.

If the initial condition is on the “$s_1$ line” then the solution (y, y') stays on that line:
$c_2 = 0$. If the initial condition is exactly on the “$s_2$ line” then the solution stays on that
secondary line: $c_1 = 0$. You can see that if $c_1 \ne 0$, the $c_1 e^{s_1 t}$ part takes over as $t \to \infty$.


[Figure 3.6 panels, left to right: Source: Unstable ($0 < s_2 < s_1$); Sink: Stable ($s_1 < s_2 < 0$), shown with the note “Reverse all the arrows in the left figure. Paths go in toward (0,0)”; Saddle: Unstable ($s_2 < 0 < s_1$).]

Figure 3.6: Real roots $s_1$ and $s_2$. The paths of the point (y(t), y'(t)) lead out when roots
are positive and lead in when roots are negative. With $s_2 < 0 < s_1$, the $s_2$-line leads in
but all other paths eventually go out near the $s_1$-line: The picture shows a saddle point.
Example for a source: y'' - 3y' + 2y = 0 leads to $s^2 - 3s + 2 = (s - 2)(s - 1) = 0$.
The roots 1 and 2 are positive. The solutions grow and $e^{2t}$ dominates.

Example for a sink: y'' + 3y' + 2y = 0 leads to $s^2 + 3s + 2 = (s + 2)(s + 1) = 0$.
The roots -2 and -1 are negative. The solutions decay and $e^{-t}$ dominates.


The Second Three Pictures 


We move to the case of complex roots $s_1$ and $s_2$. In the equation Ay'' + By' + Cy = 0,
this means that $B^2 < 4AC$. Then A and C have the same signs and B is relatively small
(underdamping). The square root in the quadratic formula (2) is an imaginary number.
The exponents $s_1$ and $s_2$ are now a complex pair $a \pm i\omega$:

Complex roots of $As^2 + Bs + C = 0$          $s_1, s_2 = -\dfrac{B}{2A} \pm \dfrac{\sqrt{B^2 - 4AC}}{2A} = a \pm i\omega$.          (2)

The path of (y, y') spirals around the center. Because of $e^{at}$, the spiral goes out
if a > 0: spiral source. Solutions spiral in if a < 0: spiral sink. The frequency $\omega$
controls how fast the solutions oscillate and how quickly the spirals go around (0,0).

In case a = -B/2A is zero (no damping), we have a center at (0,0). The only terms
left in y are $e^{i\omega t}$ and $e^{-i\omega t}$, in other words $\cos\omega t$ and $\sin\omega t$. Those paths are ellipses in
the last part of Figure 3.7. The solutions y(t) are periodic, because increasing t by $2\pi/\omega$
will not change $\cos\omega t$ and $\sin\omega t$. That circling time $2\pi/\omega$ is the period.


[Figure 3.7 panels, left to right: Spiral source: Unstable ($a = \text{Re}\,s > 0$); Spiral sink: Stable ($a = \text{Re}\,s < 0$), shown with the note “Reverse all the arrows in the left figure. Paths go in toward (0,0)”; Center: Neutrally stable ($a = \text{Re}\,s = 0$).]

Figure 3.7: Complex roots $s_1$ and $s_2$. The paths go once around (0,0) when t increases
by $2\pi/\omega$. The paths spiral in when A and B have the same signs and a = -B/2A is
negative. They spiral out when a is positive. If B = 0 (no damping) and 4AC > 0,
we have a center. The simplest center is y = sin t, y' = cos t (circle) from y'' + y = 0.
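The six pictures can be told apart directly from A, B, C. The function below is a sketch of that decision, with a hypothetical name `classify`; it assumes A is not zero and reports the limiting cases (a zero root or a repeated root) separately instead of forcing them into one of the six.

```python
import cmath

def classify(A, B, C):
    """Name the picture for A y'' + B y' + C y = 0 from the roots of A s^2 + B s + C = 0."""
    disc = B * B - 4 * A * C
    root = cmath.sqrt(disc)
    s1, s2 = (-B + root) / (2 * A), (-B - root) / (2 * A)
    if disc > 0:                          # two different real roots
        r1, r2 = s1.real, s2.real
        if r1 * r2 < 0:
            return "saddle"               # roots of opposite sign
        if r1 == 0 or r2 == 0:
            return "limiting case (a zero root)"
        return "source" if r1 > 0 else "sink"
    if disc == 0:
        return "limiting case (repeated root)"
    a = s1.real                           # complex pair a +/- i*omega
    if a == 0:
        return "center"
    return "spiral source" if a > 0 else "spiral sink"

print(classify(1, -3, 2))   # y'' - 3y' + 2y = 0
print(classify(1, 3, 2))    # y'' + 3y' + 2y = 0
print(classify(1, 0, -4))   # y'' - 4y = 0
print(classify(1, 2, 5))    # y'' + 2y' + 5y = 0
print(classify(1, 0, 1))    # y'' + y = 0
```

The first three calls are the source, sink, and saddle examples from the text. The last one is the simplest center y'' + y = 0.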
First Order Equations for $y_1$ and $y_2$


On the first page of this section, a “Special Note” mentioned another application of the same
pictures. Instead of graphing the path of (y(t), y'(t)) for one second order equation, we
could follow the path of $(y_1(t), y_2(t))$ for two first order equations. The two equations
look like this:

First order system $\mathbf{y}' = A\mathbf{y}$          $\begin{aligned} dy_1/dt &= ay_1 + by_2 \\ dy_2/dt &= cy_1 + dy_2 \end{aligned}$          (3)

The starting values $y_1(0)$ and $y_2(0)$ are given. The point $(y_1, y_2)$ will move along a path
in one of the six figures, depending on the numbers a, b, c, d.

Looking ahead, those four numbers will go into a 2 by 2 matrix A. Equation (3) will
become $d\mathbf{y}/dt = A\mathbf{y}$. The symbol $\mathbf{y}$ in boldface stands for the vector $\mathbf{y} = (y_1, y_2)$.
And most important for the six figures, the exponents $s_1$ and $s_2$ in the solution $\mathbf{y}(t)$
will be the eigenvalues $\lambda_1$ and $\lambda_2$ of the matrix A.


Companion Matrices 


Here is the connection between a second order equation and two first order equations. All 
equations on this page are linear and all coefficients are constant. I just want you to see the 
special “companion matrix” that appears in the first order equations y’ = Ay. 

Notice that $\mathbf{y}$ is printed in boldface type because it is a vector. It has two components $y_1$
and $y_2$ (those are in lightface type). The first $y_1$ is the same as the unknown y in the second
order equation. The second component $y_2$ is the velocity dy/dt:

$$y_1 = y, \quad y_2 = y' \qquad y'' + 4y' + 3y = 0 \ \text{ becomes } \ y_2' + 4y_2 + 3y_1 = 0. \qquad (4)$$

On the right you see one of the first order equations connecting $y_1$ and $y_2$. We need
a second equation (two equations for two unknowns). It is hiding at the far left! There
you see that $y_1' = y_2$. In the original second order problem this is the trivial statement
$y' = y'$. In the vector form $\mathbf{y}' = A\mathbf{y}$ it gives the first equation in our system.
The first row of our matrix is 0 1. When y and y' become $y_1$ and $y_2$,

$$y'' + 4y' + 3y = 0 \ \text{ becomes } \ \begin{aligned} y_1' &= y_2 \\ y_2' &= -3y_1 - 4y_2 \end{aligned} \ \text{ or } \ \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}' = \begin{bmatrix} 0 & 1 \\ -3 & -4 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$$


That first row 0 1 makes this a 2 by 2 companion matrix. It is the companion to the
second order equation. The key point is that the first order and second order
problems have the same “characteristic equation” because they are the same problem.


The equation $s^2 + 4s + 3 = 0$ gives the exponents $s_1 = -3$ and $s_2 = -1$


The equation $\lambda^2 + 4\lambda + 3 = 0$ gives the eigenvalues $\lambda_1 = -3$ and $\lambda_2 = -1$

The problems are the same, the exponents $-3$ and $-1$ are the same, the figures will be
the same. Those figures show a sink because $-3$ and $-1$ are real and both negative.
Solutions approach (0, 0). These equations are stable.


The companion matrix for $y'' + By' + Cy = 0$ is $A = \begin{bmatrix} 0 & 1 \\ -C & -B \end{bmatrix}$.


Row 1 of $\mathbf{y}' = A\mathbf{y}$ is $y_1' = y_2$. Row 2 is $y_2' = -Cy_1 - By_2$. When you replace $y_2$ by $y_1'$,
this means that $y_1'' + By_1' + Cy_1 = 0$: correct.


Stability for 2 by 2 Matrices 


I can explain when a 2 by 2 system $\mathbf{y}' = A\mathbf{y}$ is stable. This requires that all solutions
$\mathbf{y}(t) = (y_1(t), y_2(t))$ approach zero as $t \to \infty$. When the matrix A is a companion matrix,
this 2 by 2 system comes from one second order equation $y'' + By' + Cy = 0$. In that case
we know that stability depends on the roots of $s^2 + Bs + C = 0$. Companion matrices are
stable when $B > 0$ and $C > 0$.

From the quadratic formula, the roots have $s_1 + s_2 = -B$ and $s_1 s_2 = C$.

If $s_1$ and $s_2$ are negative, this means that $B > 0$ and $C > 0$.

If $s_1 = a + i\omega$ and $s_2 = a - i\omega$ and $a < 0$, this again means $B > 0$ and $C > 0$.
Those complex roots add to $s_1 + s_2 = 2a$. Negative a (stability) means positive B, since
$s_1 + s_2 = -B$. Those roots multiply to $s_1 s_2 = a^2 + \omega^2$. This means that C is positive, since
$s_1 s_2 = C$.

For companion matrices, stability is decided by $B > 0$ and $C > 0$. What is the stability
test for any 2 by 2 matrix? This is the key question, and Chapter 6 will answer it properly.
We will find the equation for the eigenvalues of any matrix (Section 6.1). We will test
those eigenvalues for stability (Section 6.4). Eigenvalues and eigenvectors are a major topic,
the most important link between differential equations and linear algebra. Fortunately, the
eigenvalues of 2 by 2 matrices are especially simple.


The eigenvalues of the matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ have $\lambda^2 - T\lambda + D = 0$.


The number T is $a + d$. The number D is $ad - bc$.


Companion matrices have $a = 0$ and $b = 1$ and $c = -C$ and $d = -B$. Then the characteristic
equation $\lambda^2 - T\lambda + D = 0$ is exactly $s^2 + Bs + C = 0$.


Companion matrices $\begin{bmatrix} 0 & 1 \\ -C & -B \end{bmatrix}$ have $T = a + d = -B$ and $D = ad - bc = C$.


The stability test $B > 0$ and $C > 0$ is turning into the stability test $T < 0$ and $D > 0$.
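The agreement between roots and eigenvalues can be checked numerically. The following is a minimal sketch in Python, not part of the original text; the helper names `characteristic_roots`, `companion` and `eigenvalues_2x2` are illustrative choices. It uses the example $y'' + 4y' + 3y = 0$ from above.

```python
import cmath

def characteristic_roots(B, C):
    """Roots s1, s2 of s^2 + B s + C = 0 from the quadratic formula."""
    root = cmath.sqrt(B * B - 4 * C)
    return (-B + root) / 2, (-B - root) / 2

def companion(B, C):
    """Companion matrix of y'' + B y' + C y = 0, stored as nested lists."""
    return [[0, 1], [-C, -B]]

def eigenvalues_2x2(A):
    """Eigenvalues from lambda^2 - T lambda + D = 0 with T = a + d, D = ad - bc."""
    (a, b), (c, d) = A
    T, D = a + d, a * d - b * c
    root = cmath.sqrt(T * T - 4 * D)
    return (T + root) / 2, (T - root) / 2

roots = characteristic_roots(4, 3)            # y'' + 4y' + 3y = 0
eigs = eigenvalues_2x2(companion(4, 3))
print(roots, eigs)                            # the two pairs agree
```

Using `cmath.sqrt` keeps the same code valid in the underdamped case, where the square root is imaginary.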


This is the test for any 2 by 2 matrix. Stability requires $T < 0$ and $D > 0$. Let me give
four examples and then collect together the main facts about stability.

$A_1$ (diagonal entries 0 and 3) is unstable because $T = 0 + 3$ is positive

$A_2 = \begin{bmatrix} 0 & 1 \\ 2 & -3 \end{bmatrix}$ is unstable because $D = -(1)(2)$ is negative

$A_3 = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix}$ is stable because $T = -3$ and $D = +2$

$A_4$ (diagonal entries $-1$ and $-1$) is stable because $T = -1 - 1$ is negative and $D = 1 + 1$ is positive


The eigenvalues always come from $\lambda^2 - T\lambda + D = 0$. For that last matrix $A_4$, this eigenvalue
equation is $\lambda^2 + 2\lambda + 2 = 0$. The eigenvalues are $\lambda_1 = -1 + i$ and
$\lambda_2 = -1 - i$. They add to $T = -2$ and they multiply to $D = +2$. This is a spiral
sink and it is stable.


Stability for 2 by 2 matrices: $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is stable if $T = a + d < 0$ and $D = ad - bc > 0$.


The six pictures for $(y, y')$ become six pictures for $(y_1, y_2)$. The first three pictures have
real eigenvalues from $T^2 > 4D$. The second three pictures have complex eigenvalues from
$T^2 < 4D$. This corresponds perfectly to the tests for $y'' + By' + Cy = 0$ and its companion
matrix:


Real eigenvalues      $T^2 > 4D$    $B^2 > 4C$    Overdamping
Complex eigenvalues   $T^2 < 4D$    $B^2 < 4C$    Underdamping


That gives one picture of eigenvalues $\lambda$: Real or complex. The second picture is different:
Stable or unstable. Both of those splittings are decided by T and D (or $-B$ and C).


1. Source          $T > 0$, $D > 0$, $T^2 > 4D$    Unstable

2. Sink            $T < 0$, $D > 0$, $T^2 > 4D$    Stable

3. Saddle          $D < 0$ and $T^2 > 4D$           Unstable

4. Spiral source   $T > 0$, $D > 0$, $T^2 < 4D$    Unstable

5. Spiral sink     $T < 0$, $D > 0$, $T^2 < 4D$    Stable

6. Center          $T = 0$, $D > 0$, $T^2 < 4D$    Neutral

That neutrally stable center has eigenvalues $\lambda_1 = i\omega$ and $\lambda_2 = -i\omega$ and undamped oscillation.
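The six-way list translates directly into a small decision function. This is an illustrative Python sketch (the name `classify` and the returned strings are my own labels, not the book's). The borderline cases $D = 0$ and $T^2 = 4D$, which the list of six does not cover, are reported separately.

```python
def classify(T, D):
    """Name the picture for a 2 by 2 matrix with trace T and determinant D.

    Returns (picture, stability). Exact comparisons are used, so this is
    meant for small exact inputs rather than noisy floating point data.
    """
    if D < 0:
        return "saddle", "unstable"          # T^2 > 4D holds automatically
    if D == 0 or T * T == 4 * D:
        return "borderline", "not one of the six pictures"
    if T * T > 4 * D:                        # real eigenvalues, same sign
        return ("source", "unstable") if T > 0 else ("sink", "stable")
    if T > 0:                                # complex eigenvalues from here on
        return "spiral source", "unstable"
    if T < 0:
        return "spiral sink", "stable"
    return "center", "neutral"

# Companion matrices have T = -B and D = C.
print(classify(-3, 2))    # y'' + 3y' + 2y = 0
print(classify(0, 1))     # y'' + y = 0
```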


Section 3.3 will use this information to decide the stability of nonlinear equations. 




Eigenvectors of Companion Matrices 


Eigenvalues of A come with eigenvectors. If we stay a little longer with a companion
matrix, we can see its eigenvectors. Chapter 6 will develop these ideas for any matrix,
and we need more linear algebra to understand them properly. But our vectors $(y_1, y_2)$
come from $(y, y')$ in a differential equation, and that connection makes the eigenvectors
of a companion matrix especially simple.

The fundamental idea for constant coefficient linear equations is always the same:
Look for exponential solutions. For a second order equation those solutions are
$y = e^{st}$. For a system of two first order equations those solutions are $\mathbf{y} = \mathbf{v}e^{\lambda t}$. The
vector $\mathbf{v} = (v_1, v_2)$ is the eigenvector that goes with the eigenvalue $\lambda$.


Substitute $y_1 = v_1 e^{\lambda t}$ and $y_2 = v_2 e^{\lambda t}$ into the equations $y_1' = ay_1 + by_2$ and $y_2' = cy_1 + dy_2$
and factor out $e^{\lambda t}$.


Because $e^{\lambda t}$ is the same for both $y_1$ and $y_2$, it will appear in every term. When all factors $e^{\lambda t}$
are removed, we will see the equations for $v_1$ and $v_2$. That vector $\mathbf{v} = (v_1, v_2)$ will satisfy
the eigenvector equation $A\mathbf{v} = \lambda\mathbf{v}$. This is the key to Chapter 6.

Here I only look at eigenvectors for companion matrices, because $\mathbf{v}$ has a specially nice
form. The equations are $y_1' = y_2$ and $y_2' = -Cy_1 - By_2$.


Substitute $y_1 = v_1 e^{\lambda t}$ and $y_2 = v_2 e^{\lambda t}$. Then

$$\begin{aligned} \lambda v_1 e^{\lambda t} &= v_2 e^{\lambda t} \\ \lambda v_2 e^{\lambda t} &= -Cv_1 e^{\lambda t} - Bv_2 e^{\lambda t}. \end{aligned}$$


Cancel every $e^{\lambda t}$. The first equation becomes $\lambda v_1 = v_2$. This is our answer:


Eigenvectors of companion matrices are multiples of the vector $\mathbf{v} = \begin{bmatrix} 1 \\ \lambda \end{bmatrix}$.
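A quick numerical check of this eigenvector rule, written as a hedged Python sketch (the function names are hypothetical): for each root $\lambda$ of $\lambda^2 + B\lambda + C = 0$, multiply the companion matrix by $(1, \lambda)$ and compare with $\lambda (1, \lambda)$.

```python
import cmath

def companion_eigenpairs(B, C):
    """Pairs (lambda, v) with v = (1, lambda) for the matrix [[0, 1], [-C, -B]]."""
    root = cmath.sqrt(B * B - 4 * C)
    lams = ((-B + root) / 2, (-B - root) / 2)
    return [(lam, (1, lam)) for lam in lams]

def residual(B, C, lam, v):
    """Largest entry of A v - lam v for the companion matrix A."""
    Av = (v[1], -C * v[0] - B * v[1])        # row 1 is 0 1, row 2 is -C -B
    return max(abs(Av[0] - lam * v[0]), abs(Av[1] - lam * v[1]))

for lam, v in companion_eigenpairs(4, 3):    # real eigenvalues -1 and -3
    print(lam, v, residual(4, 3, lam, v))
for lam, v in companion_eigenpairs(2, 5):    # complex pair -1 + 2i, -1 - 2i
    print(lam, v, residual(2, 5, lam, v))
```

The first component of $A\mathbf{v} - \lambda\mathbf{v}$ vanishes identically; the second vanishes exactly when $\lambda$ satisfies the characteristic equation.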


REVIEW OF THE KEY IDEAS


1. If $B^2 \neq 4AC \neq 0$, six pictures show the paths of $(y, y')$ for $Ay'' + By' + Cy = 0$.

2. Real solutions to $As^2 + Bs + C = 0$ lead to sources and sinks and saddles at (0,0).

3. Complex roots $s = a \pm i\omega$ give spirals around (0, 0) (or closed loops if $a = 0$).

4. Roots s become eigenvalues $\lambda$ for $\begin{bmatrix} y \\ y' \end{bmatrix}' = \begin{bmatrix} 0 & 1 \\ -C/A & -B/A \end{bmatrix} \begin{bmatrix} y \\ y' \end{bmatrix}$. Same six pictures.


Problem Set 3.2


1  Draw Figure 3.6 for a sink (the missing middle figure) with $y = c_1 e^{-2t} + c_2 e^{-t}$.
Which term dominates as $t \to \infty$? The paths approach the dominating line as they
go in toward zero. The slopes of the lines are $-2$ and $-1$ (the numbers $s_1$ and $s_2$).

2  Draw Figure 3.7 for a spiral sink (the missing middle figure) with roots $s = -1 \pm i$.
The solutions are $y = C_1 e^{-t}\cos t + C_2 e^{-t}\sin t$. They approach zero because
of the factor $e^{-t}$. They spiral around the origin because of $\cos t$ and $\sin t$.

3  Which path does the solution take in Figure 3.6 if $y = e^t + e^{t/2}$? Draw the
curve $(y(t), y'(t))$ more carefully starting at $t = 0$ where $(y, y') = (2, 1.5)$.

4  Which path does the solution take around the saddle in Figure 3.6 if $y = e^{t/2} + e^{-t}$?
Draw the curve more carefully starting at $t = 0$ where $(y, y') = (2, -1/2)$.

5  Redraw the first part of Figure 3.6 when the roots are equal: $s_1 = s_2 = 1$ and
$y = c_1 e^t + c_2 t e^t$. There is no s-line. Sketch the path for $y = e^t + te^t$.

6  The solution $y = e^{2t} - 4e^t$ gives a source (Figure 3.6), with $y' = 2e^{2t} - 4e^t$. Starting
at $t = 0$ with $(y, y') = (-3, -2)$, where is $(y, y')$ when $e^t = 1.1$ and $e^t = .25$ and
$e^t = \ldots$?

7  The solution $y = e^t(\cos t + \sin t)$ has $y' = 2e^t \cos t$. This spirals out because of $e^t$.
Plot the points $(y, y')$ at $t = 0$ and $t = \pi/2$ and $t = \pi$, and try to connect them with a
spiral. Note that $e^{\pi/2} \approx 4.8$ and $e^{\pi} \approx 23$.

8  The roots $s_1$ and $s_2$ are $\pm 2i$ when the differential equation is ____. Starting from
$y(0) = 1$ and $y'(0) = 0$, draw the path of $(y(t), y'(t))$ around the center. Mark the
points when $t = \pi/2, \pi, 3\pi/2, 2\pi$. Does the path go clockwise?

9  The equation $y'' + By' + y = 0$ leads to $s^2 + Bs + 1 = 0$. For $B = -3, -2, -1, 0,
1, 2, 3$ decide which of the six figures is involved. For $B = -2$ and 2, why do we not
have a perfect match with the source and sink figures?

10  For $y'' + y' + Cy = 0$ with damping $B = 1$, the characteristic equation will be
$s^2 + s + C = 0$. Which C gives the changeover from a sink (overdamping) to a spiral
sink (underdamping)? Which figure has $C < 0$?


Problems 11-18 are about $d\mathbf{y}/dt = A\mathbf{y}$ with companion matrices $\begin{bmatrix} 0 & 1 \\ -C & -B \end{bmatrix}$.


11  The eigenvalue equation is $\lambda^2 + B\lambda + C = 0$. Which values of B and C give
complex eigenvalues? Which values of B and C give $\lambda_1 = \lambda_2$?

12  Find $\lambda_1$ and $\lambda_2$ if $B = 8$ and $C = 7$. Which eigenvalue is more important as $t \to \infty$?
Is this a sink or a saddle?

13  Why do the eigenvalues have $\lambda_1 + \lambda_2 = -B$? Why is $\lambda_1 \lambda_2 = C$?

14  Which second order equations did these matrices come from?

$A_1 = [\,\ldots\,]$ (saddle)    $A_2 = [\,\ldots\,]$ (center)

15  The equation $y'' = 4y$ produces a saddle point at (0,0). Find $s_1 > 0$ and $s_2 < 0$
in the solution $y = c_1 e^{s_1 t} + c_2 e^{s_2 t}$. If $c_1 c_2 \neq 0$, this solution will be (large) (small) as
$t \to \infty$ and also as $t \to -\infty$.

The only way to go toward the saddle $(y, y') = (0, 0)$ as $t \to \infty$ is $c_1 = 0$.

16  If $B = -5$ and $C = 6$ the eigenvalues are $\lambda_1 = 3$ and $\lambda_2 = 2$. The vectors $\mathbf{v} = (1, 3)$
and $\mathbf{v} = (1, 2)$ are eigenvectors of the matrix A: Multiply $A\mathbf{v}$ to get $3\mathbf{v}$ and $2\mathbf{v}$.

17  In Problem 16, write the two solutions $\mathbf{y} = \mathbf{v}e^{\lambda t}$ to the equations $\mathbf{y}' = A\mathbf{y}$.
Write the complete solution as a combination of those two solutions.

18  The eigenvectors of a companion matrix have the form $\mathbf{v} = (1, \lambda)$. Multiply by A
to show that $A\mathbf{v} = \lambda\mathbf{v}$ gives one trivial equation and the characteristic equation
$\lambda^2 + B\lambda + C = 0$.

$$\begin{bmatrix} 0 & 1 \\ -C & -B \end{bmatrix} \begin{bmatrix} 1 \\ \lambda \end{bmatrix} = \lambda \begin{bmatrix} 1 \\ \lambda \end{bmatrix} \quad \text{is} \quad \begin{aligned} \lambda &= \lambda \\ -C - B\lambda &= \lambda^2 \end{aligned}$$

Find the eigenvalues and eigenvectors of $A = [\,\ldots\,]$.

19  An equation is stable and all its solutions $y = c_1 e^{s_1 t} + c_2 e^{s_2 t}$ go to $y(\infty) = 0$
exactly when

($s_1 < 0$ or $s_2 < 0$)    ($s_1 < 0$ and $s_2 < 0$)    ($\mathrm{Re}\, s_1 < 0$ and $\mathrm{Re}\, s_2 < 0$)?

20  If $Ay'' + By' + Cy = D$ is stable, what is $y(\infty)$?


3.3. Linearization and Stability in 2D and 3D


The logistic equation $y' = y - y^2$ has two steady states $Y = 0$ and $Y = 1$. Those are critical
points, where the function $f(y) = y - y^2$ is zero. Along the lines $Y = 0$ and $Y = 1$ the
equation $y' = f(y)$ becomes $0 = 0$. We have those two steady solutions, and their stability
or instability is important. Do nearby solutions approach Y or not?
The stability test requires $df/dy < 0$ at Y. This is the slope of the tangent to f(y):

$$f(y) \approx f(Y) + \left(\frac{df}{dy}\right)(y - Y) = 0 + A(y - Y). \qquad (1)$$

The linearization of $y' = f(y)$ at the critical point $y = Y$ comes from $f \approx A(y - Y)$.
Replace f by this linear part and include the constant Y on the left side too:


Linearized equation near a critical point Y    $(y - Y)' = A(y - Y)$.    (2)


The solution $y - Y = Ce^{At}$ grows if $A > 0$ (instability). The solution decays if $A < 0$.
The logistic equation has $f(y) = y - y^2$ with derivative $A = 1 - 2y$. At the steady state
$Y = 0$ this shows instability ($A = +1$). The other critical point $Y = 1$ is stable ($A = -1$).
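The slope test can be tried numerically. The sketch below is an illustration in Python, assuming a central difference with step `h` (my choice, not from the text) as a stand-in for the exact derivative $A = 1 - 2y$.

```python
def f(y):
    """Right side of the logistic equation y' = y - y^2."""
    return y - y * y

def slope(g, Y, h=1e-6):
    """Central difference estimate of dg/dy at the point Y."""
    return (g(Y + h) - g(Y - h)) / (2 * h)

for Y in (0.0, 1.0):                 # the two steady states
    A = slope(f, Y)
    verdict = "stable" if A < 0 else "unstable"
    print(f"Y = {Y}: A is about {A:.6f}, {verdict}")
```

For a quadratic f the central difference is exact apart from rounding, so the estimates land very close to $+1$ and $-1$.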


The stability line or phase line in Section 1.7 showed $Y = 1$ as the attractor:


$y < 0$:        left arrows, $y - y^2 < 0$,     $y(t) \to -\infty$
$0 < y < 1$:    right arrows, $y - y^2 > 0$,    $y(t) \to 1$
$y > 1$:        left arrows, $y - y^2 < 0$,     $y(t) \to 1$


The arrows in Section 3.1 had slopes f(t, y). Stability is decided by the slope $df/dy$.


Note The most basic example is $y' = y$. The only steady state solution is $Y = 0$. That
must be unstable, because $f = y$ has $A = df/dy = 1$. All other solutions $y(t) = Ce^t$ travel
far away from $Y = 0$, even when $C = y(0)$ is close to zero.

Opposite case: $y' = 6 - y$ is stable ($A = -1$). Solutions approach $Y = y_\infty = 6$.


Solution Curves in the yz Plane 


Those paragraphs were review for one unknown y(t). Section 3.2 had two unknowns y and
z in two linear first order equations (or y and y' in a linear second order equation).
Move now to nonlinear. The equations will be autonomous, the same at all times t:


$$\frac{dy}{dt} = f(y, z) \quad \text{and} \quad \frac{dz}{dt} = g(y, z) \quad \text{starting from } y(0) \text{ and } z(0). \qquad (3)$$


A critical point Y, Z solves $f(Y, Z) = 0$ and $g(Y, Z) = 0$. It is a steady solution: constant
$y = Y$ and constant $z = Z$.

Critical point    $f(Y, Z) = 0$ and $g(Y, Z) = 0$    (4)


For every critical point Y, Z we must decide: stable or unstable or neutral ? 

To graph the solutions, there is a problem with y and z and t. Three variables won’t fit
into a 2D picture. Our solution curves for autonomous equations will omit t. The curves
y(t), z(t) show the paths of solutions in the y, z plane but not the times along those paths.


Those pictures do not show the time t, as the solution moves. Different equations
$dy/dt = cf(y, z)$ and $dz/dt = cg(y, z)$ will produce the same picture for all $c \neq 0$.
That constant c just rescales the time and the speed along the same path $y(ct), z(ct)$.
Time and speed are not shown by the pictures.

Each steady state y(t) = Y,z(t) = Z will be one point in the picture! The stability 
question is whether paths near that point (those are nearby solutions) go in toward Y, Z 
or away from Y, Z or around Y, Z: stable or unstable or neutrally stable. 

That stability question is answered by the eigenvalues of a 2 by 2 matrix A. 


Solutions Near a Critical Point 


Here is the key to this section. Very close to a critical point where $f(Y, Z) = 0$ and
$g(Y, Z) = 0$, solution curves have the same six possibilities that we already know:


Stable      Sink, Spiral sink
Unstable    Source, Spiral source, Saddle point
Neutral     Center


The pictures for linear equations were in Section 3.2. They came from six possibilities
for the roots of $As^2 + Bs + C = 0$, and from six types of 2 by 2 matrices A:


Linear equations, constant coefficients

$$\begin{aligned} y' &= ay + bz \\ z' &= cy + dz \end{aligned} \qquad \begin{bmatrix} y \\ z \end{bmatrix}' = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} y \\ z \end{bmatrix} \qquad (5)$$


Those model problems in 2D have the critical point $Y = 0$, $Z = 0$. That is the point where
$f(y, z) = ay + bz = 0$ and $g(y, z) = cy + dz = 0$. There is one critical point (0,0) at the
center of each picture in Section 3.2. Now we are saying that nonlinear equations look like
linear equations when you look near each critical point.

This is the 2D equivalent of one equation $(y - Y)' = A(y - Y)$. That number A
was $df/dy$. Now we have two unknowns y and z, and two functions f(y, z) and g(y, z).
There are four partial derivatives of f and g, and they go into the 2 by 2 matrix A:


First derivative matrix (“Jacobian matrix”)

$$A = \begin{bmatrix} \partial f/\partial y & \partial f/\partial z \\ \partial g/\partial y & \partial g/\partial z \end{bmatrix} \qquad (6)$$


Linearization of a Nonlinear Equation 


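For one equation, linearization was based on the tangent line. The beginning of the Taylor
series around Y is $f(Y) + (df/dy)(y - Y)$. Critical points have $f(Y) = 0$, removing the
constant term. Two variables y and z lead to the same idea, but now it is a tangent plane:


$$\begin{aligned} f(y, z) &\approx f(Y, Z) + \left(\frac{\partial f}{\partial y}\right)(y - Y) + \left(\frac{\partial f}{\partial z}\right)(z - Z) \\ g(y, z) &\approx g(Y, Z) + \left(\frac{\partial g}{\partial y}\right)(y - Y) + \left(\frac{\partial g}{\partial z}\right)(z - Z) \end{aligned} \qquad (7)$$

At a critical point $f(Y, Z) = 0$ and $g(Y, Z) = 0$, so $y' = f$ and $z' = g$ become

$$\begin{bmatrix} y - Y \\ z - Z \end{bmatrix}' \approx \begin{bmatrix} \partial f/\partial y & \partial f/\partial z \\ \partial g/\partial y & \partial g/\partial z \end{bmatrix} \begin{bmatrix} y - Y \\ z - Z \end{bmatrix} \qquad (8)$$


There stands the linearized equation. It is centered and linearized around the special point
(Y, Z). If we reset by shifting (Y, Z) to (0,0), equation (8) is one of our model problems:


$$\begin{bmatrix} y \\ z \end{bmatrix}' = A \begin{bmatrix} y \\ z \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} y \\ z \end{bmatrix} \qquad (9)$$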


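Example 1 Linearize $y' = \sin(ay + bz)$ and $z' = \sin(cy + dz)$ at $Y = 0$, $Z = 0$.


Solution Check first: $f = \sin(ay + bz)$ and $g = \sin(cy + dz)$ are zero at $(Y, Z) = (0, 0)$.
This is a critical point. The first derivatives of f and g at that point go into A.


$\partial f/\partial y = a\cos(ay + bz) = a\cos 0 = a$ when $(y, z) = (0, 0)$


The other three partial derivatives give b and c and d. They enter the matrix A:


$$\begin{aligned} y' &= \sin(ay + bz) \\ z' &= \sin(cy + dz) \end{aligned} \quad \text{linearizes to} \quad \begin{aligned} y' &= ay + bz \\ z' &= cy + dz \end{aligned} \qquad \begin{bmatrix} y \\ z \end{bmatrix}' = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} y \\ z \end{bmatrix} \qquad (10)$$


That example just moved the simple linearization $\sin x \approx x$ into two variables.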
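Example 1 can be checked without any calculus by estimating the four partial derivatives with central differences. This Python sketch is only an illustration: the function name `jacobian`, the step `h`, and the sample values of a, b, c, d are assumptions, not taken from the text.

```python
import math

def jacobian(f, g, Y, Z, h=1e-6):
    """2 by 2 matrix of central difference partial derivatives at (Y, Z)."""
    return [
        [(f(Y + h, Z) - f(Y - h, Z)) / (2 * h), (f(Y, Z + h) - f(Y, Z - h)) / (2 * h)],
        [(g(Y + h, Z) - g(Y - h, Z)) / (2 * h), (g(Y, Z + h) - g(Y, Z - h)) / (2 * h)],
    ]

a, b, c, d = 2.0, -1.0, 0.5, 3.0           # illustrative coefficients

def f(y, z):
    return math.sin(a * y + b * z)

def g(y, z):
    return math.sin(c * y + d * z)

A = jacobian(f, g, 0.0, 0.0)               # linearize at the critical point (0, 0)
print(A)                                   # close to [[a, b], [c, d]]
```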


Example 2 (Predator-Prey) Linearize $y' = y - yz$ and $z' = yz - z$ at all critical points.

Meaning of these predator-prey equations The prey y is like rabbits, the predator
z is like foxes. On their own with no foxes, the rabbits grow by nibbling grass: $y' = y$.
On their own with no rabbits, the foxes don’t eat well and $z' = -z$. Then the
multiplication yz accounts for the interactions between y rabbits and z foxes. Those
interactions end up in more foxes and fewer rabbits.

This example has simplified coefficients 1 and $-1$ multiplying y and z and yz.
The predator-prey model is a great example and we will develop it further.


Linearize Predator-Prey at Critical Points


Set $f = Y - YZ = 0$ and also $g = YZ - Z = 0$. Solve for all critical points Y, Z.


$$Y - YZ = Y(1 - Z) = 0 \quad \text{and} \quad YZ - Z = (Y - 1)Z = 0.$$

The critical points Y, Z are 0, 0 and 1, 1. Track their stability using the matrix A.


$$\text{At } Y, Z = 0, 0 \qquad A = \begin{bmatrix} \partial f/\partial y & \partial f/\partial z \\ \partial g/\partial y & \partial g/\partial z \end{bmatrix} = \begin{bmatrix} 1 - Z & -Y \\ Z & Y - 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$$

This is a saddle point: unstable. Starting near 0, 0 the rabbit population y(t) will grow.
The eigenvalues are 1 (for the rabbits) and $-1$ (for the foxes) from $y' = y$ and $z' = -z$.
An all-fox population would decay (this is the only path in to the saddle point).


$$\text{At } Y, Z = 1, 1 \qquad A = \begin{bmatrix} 1 - Z & -Y \\ Z & Y - 1 \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$$

This matrix has imaginary eigenvalues $\lambda_1 = i$ and $\lambda_2 = -i$. Their real parts are zero.
The stability is neutral. The critical point $Y = 1$, $Z = 1$ is a center. A solution that
starts near that point will go around 1, 1 and return where it started:


Extra rabbits → Foxes increase → Rabbits decrease → Foxes decrease → Extra rabbits


We can see without eigenvalues that the solution to the linearized equations makes a
perfect circle around (1, 1). The matrix A has $-1$ in row 1 and $+1$ in row 2.


$$\begin{bmatrix} y' \\ z' \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} y - 1 \\ z - 1 \end{bmatrix} \quad \text{is solved by} \quad \begin{aligned} y - 1 &= r\cos t \\ z - 1 &= r\sin t \end{aligned} \qquad (11)$$

The actual nonlinear solution y(t), z(t) won’t make a perfect circle. Usually we can’t
find its exact path, but in this case we can. The y-z equation is separable and solvable:


$$\frac{dy}{dz} = \frac{dy/dt}{dz/dt} = \frac{f}{g} = \frac{y(1 - z)}{z(y - 1)} \quad \text{separates into} \quad \frac{y - 1}{y}\, dy = \frac{1 - z}{z}\, dz. \qquad (12)$$


Integration of 1 and 1/y and 1/z gives $y - \ln y = \ln z - z + C$. That constant is
$C = 2$ when $y = z = 1$ (critical). These solution curves are drawn in Figure 3.8 for
$C = 2.1, 2.2, 2.3, 2.4$. They are nearly circular near $C = 2$. That is linearization!

As C increases, y and z move further away from 1 and the circles are lost. But the 
nonlinear solution is still periodic. The rabbit-fox population comes back to its starting 
point and goes around again. Populations can be close to cyclic. 
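The closed curves $y + z - \ln y - \ln z = C$ can be confirmed by putting time back with a numerical method. What follows is a minimal Python sketch using a hand-written fourth order Runge-Kutta step; the starting point (1.3, 1.0), the step size and the function names are illustrative assumptions. It tracks how far the computed path drifts from its starting value of C.

```python
import math

def rhs(y, z):
    """Predator-prey right sides: y' = y - yz, z' = yz - z."""
    return y - y * z, y * z - z

def rk4_step(y, z, dt):
    """One classical Runge-Kutta step for the pair (y, z)."""
    k1 = rhs(y, z)
    k2 = rhs(y + 0.5 * dt * k1[0], z + 0.5 * dt * k1[1])
    k3 = rhs(y + 0.5 * dt * k2[0], z + 0.5 * dt * k2[1])
    k4 = rhs(y + dt * k3[0], z + dt * k3[1])
    y_new = y + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    z_new = z + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return y_new, z_new

def invariant(y, z):
    """The constant C on a solution path: y + z - ln y - ln z."""
    return y + z - math.log(y) - math.log(z)

y, z = 1.3, 1.0                       # a start near the center (1, 1)
C0 = invariant(y, z)
drift = 0.0
for _ in range(10000):                # t from 0 to 10 with dt = 0.001
    y, z = rk4_step(y, z, 0.001)
    drift = max(drift, abs(invariant(y, z) - C0))
print(C0, drift)                      # drift stays tiny: the path stays on its curve
```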

Equation (12) took time out of the picture. A numerical solution (Euler or Runge-Kutta)
puts time back. This famous model came from Lotka and Volterra in 1925.

[Figure 3.8 plot: horizontal axis rabbits y, vertical axis foxes z, saddle point at the origin, closed paths around the point $y = 1$, $z = 1$.]


Figure 3.8: Solution paths $y + z - \ln y - \ln z = C$ around the critical point: a center.


Predator — Prey — Logistic Equation 


When Example 2 has no foxes ($z = 0$), the rabbit equation is $y' = y$. There is no control of
rabbits and $y = Ce^t$. When we add a logistic term like $-qy^2$ (rabbits eventually competing
with rabbits for available lettuce) this makes the equations more realistic.

We also allow different coefficients p, r, s, w (not all 1 or $-1$) in the other terms:


Rabbits  $y' = y(p - qy - rz)$      First critical point   $(Y, Z) = (0, 0)$
Foxes    $z' = z(-s + wy)$          Second point           $(Y, Z) = (p/q, 0)$
                                    Third point            $s = wY$ and $p = qY + rZ$


At those critical points, y' and z' are zero. The solutions are steady states $y = Y$, $z = Z$.
Near those points we linearize the equation to decide stability. The derivatives of
f(y, z) and g(y, z) are in control, because $f = g = 0$ at the critical points:


$$\text{First derivatives} \quad \begin{bmatrix} \partial f/\partial y & \partial f/\partial z \\ \partial g/\partial y & \partial g/\partial z \end{bmatrix} = \begin{bmatrix} p - 2qy - rz & -ry \\ wz & -s + wy \end{bmatrix} \qquad \text{Jacobian at } 0, 0 \quad \begin{bmatrix} p & 0 \\ 0 & -s \end{bmatrix}$$

(0,0) is a saddle point: unstable. Small populations have $y' \approx py$ and $z' \approx -sz$.
Rabbits increase and foxes decrease. One eigenvalue p is positive, the other eigenvalue
$-s$ is negative. Near this (0,0) point, the competition terms $-qy^2$ and $-ryz$ and $wyz$
are higher order. Those terms disappear in the linearization.

The second critical point has $Y = p/q$ and $Z = 0$. This point is a sink or a saddle:


$$\text{Linearization around } (p/q, 0) \qquad \begin{bmatrix} y - Y \\ z - Z \end{bmatrix}' = A \begin{bmatrix} y - Y \\ z - Z \end{bmatrix} \qquad A = \begin{bmatrix} -p & -rp/q \\ 0 & -s + wp/q \end{bmatrix}$$

If $s > wp/q$, that last entry is negative. So is $-p$, and we have a sink: two negative eigenvalues.
If $s < wp/q$, that last entry is positive. In this case we have a saddle.

The third critical point (Y, Z) is different. At this point $p = qY + rZ$ and $s = wY$.
This leaves only three simple terms in the first derivative matrix above:


$$\text{Linearization around } (Y, Z) \qquad \begin{bmatrix} y - Y \\ z - Z \end{bmatrix}' = A \begin{bmatrix} y - Y \\ z - Z \end{bmatrix} \qquad A = \begin{bmatrix} -qY & -rY \\ wZ & 0 \end{bmatrix}$$

The new term $-qy^2$ in the rabbit equation has produced $-qY = -qs/w$ in the matrix A.
This is a negative number and it stabilizes the equation. It pulls both of the eigenvalues
(previously imaginary) to negative real parts. Neutral stability changes to full stability.

2 by 2 matrices are special (with only two eigenvalues $\lambda_1$ and $\lambda_2$). I can reveal the two
facts that produce those two eigenvalues of A: Add the $\lambda$’s and multiply the $\lambda$’s.


Sum $\lambda_1 + \lambda_2$ equals the sum T of diagonal entries          $T = -qY$
Product $\lambda_1 \lambda_2$ equals the determinant D of the matrix      $D = rYwZ$

Our matrix has $\lambda_1 + \lambda_2 < 0$ and $\lambda_1 \lambda_2 > 0$. This suggests two negative eigenvalues
$\lambda_1$ and $\lambda_2$ (a sink). It also allows $\lambda_1 = a + ib$ and $\lambda_2 = a - ib$ ($a < 0$, a spiral sink).
Our conclusion is: The third critical point Y, Z is stable.
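To see the formulas $T = -qY$ and $D = rYwZ$ at work, here is a small Python sketch. The coefficient values are hypothetical, chosen only so that the third critical point has positive populations; the function names are mine.

```python
def third_critical_point(p, q, r, s, w):
    """Solve s = wY and p = qY + rZ for the coexistence point (Y, Z)."""
    Y = s / w
    Z = (p - q * Y) / r
    return Y, Z

def jacobian(p, q, r, s, w, y, z):
    """First derivative matrix of f = y(p - qy - rz) and g = z(-s + wy)."""
    return [[p - 2 * q * y - r * z, -r * y],
            [w * z, -s + w * y]]

p, q, r, s, w = 1.0, 0.5, 1.0, 1.0, 1.0      # hypothetical coefficients
Y, Z = third_critical_point(p, q, r, s, w)
A = jacobian(p, q, r, s, w, Y, Z)
T = A[0][0] + A[1][1]                        # trace = lambda1 + lambda2
D = A[0][0] * A[1][1] - A[0][1] * A[1][0]    # determinant = lambda1 * lambda2
print((Y, Z), A, T, D)
print("stable" if T < 0 and D > 0 else "not stable")
```

With these sample numbers $T^2 < 4D$, so the stable point is a spiral sink; other coefficient choices can give a plain sink.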


Final Tests for Stability : Trace and Determinant 


We can bring this whole section together. It started with finding the critical points Y, Z and
linearizing the differential equations. Now we can give simple tests on the 2 by 2
linearized matrix A. We don’t need to compute the eigenvalues before testing them,
because the matrix immediately tells us their sum $\lambda_1 + \lambda_2$ and their product $\lambda_1 \lambda_2$.
That sum and product (the trace and determinant of A) are all we need.


Step 1  Find all critical points (steady states) of $y' = f(y, z)$ and $z' = g(y, z)$
        by solving $f(Y, Z) = 0$ and $g(Y, Z) = 0$.

Step 2  At each critical point find the matrix A from derivatives of f and g

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} \partial f/\partial y & \partial f/\partial z \\ \partial g/\partial y & \partial g/\partial z \end{bmatrix} \quad \text{at the point } Y, Z$$

Step 3  Decide stability from the trace $T = a + d$ and determinant $D = ad - bc$


Unstable    $T > 0$ or $D < 0$ or both
Neutral     $T = 0$ and $D \geq 0$
Stable      $T < 0$ and $D > 0$


If $T^2 > 4D > 0$, the stable critical point is a sink: real eigenvalues less than zero.
If $T^2 < 4D$, the stable critical point is a spiral sink: complex eigenvalues with $\mathrm{Re}\, \lambda < 0$.
Section 6.4 will explain these rules and draw the stable region $T < 0$, $D > 0$.

The solution curves y(t), z(t) are paths in the yz plane. Near each critical point Y, Z,
the paths are close to one of the six possibilities in Section 3.2. Source, sink, or saddle
for real eigenvalues; spiral source, spiral sink, or center for complex eigenvalues.


A Special 3 by 3 System: A Tumbling Box 


You understand that 3 by 3 systems will be more complicated. The pictures don’t stay in
a plane. There are 9 partial derivatives of f, g, h with respect to x, y, z. The matrix A
with those entries is 3 by 3. Its three eigenvalues decide stability (T and D are not enough).

But we live in three dimensions. The most ordinary motions will follow a space curve 
and not a plane curve. We can imagine the whole of three-dimensional space filled with 
those curves—that picture is hard to draw. Still there are important special motions that 
we can understand (and even test for ourselves). Here is a beautiful example. 


Throw a closed box up in the air. Throw a cell phone. Throw this book.
Those all have unequal sides $s_1 < s_2 < s_3$. Gravity will bring the book or the box back
down, but that is not the interesting part. The key is how it turns in space.

There are three special ways to throw the box. It can rotate around the short side $s_1$.
It can rotate around the longest side $s_3$. The box can try to rotate around its middle side $s_2$.
Those three motions will be critical points. Your throwing experiment will quickly find that
two of the rotations are stable and one is unstable. In this book on differential equations,
we want to understand why. Please put a rubber band around the book.

Since the up and down motion from gravity is not important, we will remove it. 
Keep the origin (0,0, 0) at the center of the box. The box turns around that center point. 
At every moment in time, a 3D rotation is around an axis. If the box tumbles around 
in the air, that rotation axis is changing with time. 

After writing about boxes I thought of another important example. Throw a football.
If you throw it the right way, spinning around its long axis, it flies smoothly. Any
quarterback does that automatically. But if your arm is hit while throwing, the ball wobbles.
A football has one long axis and two equal short axes, $s_1 = s_2 < s_3$.

One more: A well-thrown frisbee spins around its short axis (very short). Its long axes
go out to the edges of the frisbee, so $s_1 < s_2 = s_3$. A bad throw will make it tumble.


Tumbling indicates an unstable critical point for the equations of motion. 


Equations of Motion: Simplest Form 


For a box of the right shape, Euler found these three equations. The unknowns x, y, z give
the angular momentum around axes 1, 2, 3 (short, medium, long).


$f(x, y, z)$:  $dx/dt = yz$        Critical points X, Y, Z have $f = g = h = 0$
$g(x, y, z)$:  $dy/dt = -2xz$      There are 6 critical points on a sphere
$h(x, y, z)$:  $dz/dt = xy$        $(X, Y, Z) = (\pm 1, 0, 0)$  $(0, \pm 1, 0)$  $(0, 0, \pm 1)$

Multiply the three equations by x, y, z and add them together, to see the sphere:


$$x\frac{dx}{dt} + y\frac{dy}{dt} + z\frac{dz}{dt} = xyz - 2xyz + xyz = 0 \qquad x^2 + y^2 + z^2 = \text{constant}.$$

The point x, y, z travels on a sphere. There are six critical points X, Y, Z (steady rotations).
The question is, which steady states are stable? Try the experiment. Toss up a book.


Linearize at Each Critical Point 


When you take 9 partial derivatives of $f = yz$ and $g = -2xz$ and $h = xy$, you get the 3
by 3 Jacobian matrix J. Its first row 0 z y contains the partial derivatives of $f = yz$.
At each critical point, substitute X, Y, Z into J to see the matrix A in the linearized
equations. The six critical points (X, Y, Z) are $(\pm 1, 0, 0)$ and $(0, \pm 1, 0)$ and $(0, 0, \pm 1)$.
The three matrices below are for (1, 0, 0), (0, 1, 0) and (0, 0, 1).


$$J = \begin{bmatrix} 0 & z & y \\ -2z & 0 & -2x \\ y & x & 0 \end{bmatrix} \to A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -2 \\ 0 & 1 & 0 \end{bmatrix} \quad \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix} \quad \begin{bmatrix} 0 & 1 & 0 \\ -2 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$


That middle matrix A with two ones gives instability around the point (0, 1, 0). Start the
linearized equations from the nearby point (c, 1, c).


$$\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \ \text{ is } \ \begin{aligned} x' &= z \\ y' &= 0 \\ z' &= x \end{aligned} \quad \text{Then} \quad \begin{aligned} x &= ce^t \\ y &= 1 \\ z &= ce^t \end{aligned} \qquad (13)$$


Those solutions with $e^t$ are leaving the critical point. You are seeing the eigenvalue $\lambda = 1$.
The other eigenvalues are 0 and $-1$: a saddle point. When you try to spin a box around
its middle axis, the wobble quickly gets worse. It is humanly impossible to spin the box
perfectly because that axis is unstable.

The other two axes are neutrally stable. Their matrices A have $-2$ and $+1$. Their
eigenvalues are $\sqrt{2}\, i$ and $-\sqrt{2}\, i$ and 0. Around the short axis (1, 0, 0), the essential part
of A is 2 by 2. We see sines and cosines (not $e^t$ and instability):


$$\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -2 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ -2z \\ y \end{bmatrix}. \quad \text{Then} \quad \begin{aligned} x &= 1 \\ y &= \sqrt{2}\, c\cos(\sqrt{2}\, t) \\ z &= c\sin(\sqrt{2}\, t) \end{aligned}$$


The turning axis (x, y, z) travels in an ellipse around (1, 0, 0). This indicates a center.
Let me go back to the nonlinear equations to see that elliptical cylinder $y^2 + 2z^2 = C$.


Multiply $x' = yz$, $y' = -2xz$, $z' = xy$ by 0, y, 2z. Add to get $yy' + 2zz' = 0$.


The derivative of $y^2 + 2z^2$ is zero. Every path x(t), y(t), z(t) is an ellipse on the sphere.


Alar Toomre’s Picture of the Solutions


At this point we know a lot about every solution to $x' = yz$ and $y' = -2xz$ and $z' = xy$.


Stays on a sphere                $x^2 + y^2 + z^2 = C_1$    Multiply the equations by x, y, z and add.

Stays on an elliptical cylinder  $2x^2 + y^2 = C_2$         Multiply by 2x, y, 0 and add.

Stays on an elliptical cylinder  $y^2 + 2z^2 = C_3$         Multiply by 0, y, 2z and add.

Stays on a hyperbolic cylinder   $x^2 - z^2 = C_4$          Multiply by x, 0, $-z$ and add.

Professor Alar Toomre made the tumbling box famous among MIT students. The year when I 
went to his 18.03 lecture, he tossed up a book several times (in all three ways). 
The book turned or tumbled around its short and middle and long axes: stable, unstable, 
and stable. Actually the stability is only neutral, and wobbles don’t grow or disappear. 
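The tossing experiment can be imitated on a computer. This is a hedged Python sketch, assuming a hand-written Runge-Kutta step and illustrative starting points (a wobble of 0.01 around each axis); none of the names come from the text. It integrates $x' = yz$, $y' = -2xz$, $z' = xy$ and records how far the path wanders from the critical point it started near.

```python
def rhs(x, y, z):
    """Right sides of x' = yz, y' = -2xz, z' = xy."""
    return y * z, -2 * x * z, x * y

def rk4_step(state, dt):
    """One classical Runge-Kutta step for the state (x, y, z)."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = rhs(*state)
    k2 = rhs(*shifted(k1, dt / 2))
    k3 = rhs(*shifted(k2, dt / 2))
    k4 = rhs(*shifted(k3, dt))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def wander(start, axis, t_end=20.0, dt=0.001):
    """Return (largest distance from the critical point `axis`, final state)."""
    state, worst = start, 0.0
    for _ in range(int(t_end / dt)):
        state = rk4_step(state, dt)
        dist = sum((s - a) ** 2 for s, a in zip(state, axis)) ** 0.5
        worst = max(worst, dist)
    return worst, state

near_short, _ = wander((1.0, 0.01, 0.01), (1.0, 0.0, 0.0))    # short axis
near_middle, end = wander((0.01, 1.0, 0.01), (0.0, 1.0, 0.0)) # middle axis, start (c, 1, c)
print(near_short, near_middle)        # small wobble versus a full tumble
print(sum(v * v for v in end))        # x^2 + y^2 + z^2 is still its starting value
```

Around the short axis the wobble stays the size it started with. Around the middle axis the path leaves the neighborhood completely while staying on the sphere.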

Maybe you can see those ellipses around two critical points: cylinders intersect a sphere.
The website will show one of those cylinders going around (1, 0, 0): a neutrally stable case.
It is harder to visualize the hyperbolas $x^2 - z^2 = C_4$ around the unstable point (0, 1, 0).

This figure shows the value of seeing a solution, not just its formula. With good fortune
a video of this experiment will go onto the book’s website math.mit.edu/dela.


Figure 3.9: Toomre’s picture of solution paths x(t), y(t), z(t) from Euler’s three equations.


I will end this example with a square box: two equal axes. The symmetry of a football
also produces two equal axes. The Earth itself is flatter near the North Pole and South Pole,
and symmetric around that short axis. Fortunately for us this case is neutrally stable.

The Earth’s wobble doesn’t go away, and at the same time it doesn’t get worse. The spin axis
passes about five meters from the North Pole.


Flattened sphere    $dx/dt = 0$       Critical points $(\pm 1, 0, 0)$ at Poles
Square book         $dy/dt = -xz$     Critical plane (0, y, z)
Two equal axes      $dz/dt = xy$      (the plane of the Equator)


The partial derivatives of $-xz$ and $xy$ are quick to compute at $(X, Y, Z) = (1, 0, 0)$:

$$A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix} \quad \text{has eigenvalues } \lambda = i \text{ and } \lambda = -i \text{ and } \lambda = 0$$


The path of x, y, z is a circle around the North Pole (for the nonlinear equations too).
The Earth wobbles as it spins, but it stays stable. Not like a tumbling box.


Epidemics and the SIR Model 


An epidemic can spread until a serious fraction of the population gets sick—or the epidemic 
can die out early. Unstable or stable: always the important question. Suppose it is a flu 
epidemic on a closed campus (with no flu shots). The population divides into three groups : 


S =Susceptible (may catch the flu) 
I = Infected (sick with the flu) 
R = Recovered (after having the flu) 


The equations for S(t), I(t), R(t) will involve an infection constant β and a recovery
constant α. The infection rate is βSI, proportional to the susceptible fraction S times the
infected (and infectious) fraction I. The recovery rate is simply αI. This simple model has
been improved in many ways, and SIR is now a highly developed technique. Epidemiology
has major importance, and we want to present this small model :


dS/dt = −βSI = f(S, I)
dI/dt = βSI − αI = g(S, I)
dR/dt = αI


We work with fractions of the total population, so S + I + R = 1. Adding the equations
confirms that S + I + R is constant (their derivatives add to zero). It is enough to study
S and I. We are ignoring births and deaths: our system is closed and the epidemic is fast.

The important critical point is S = 1, I = 0. The population is well, but everyone
is susceptible. Flu is coming. Is that critical point stable if a few people get sick ?


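$$\begin{bmatrix} \partial f/\partial S & \partial f/\partial I \\ \partial g/\partial S & \partial g/\partial I \end{bmatrix} = \begin{bmatrix} -\beta I & -\beta S \\ \beta I & \beta S - \alpha \end{bmatrix} = \begin{bmatrix} 0 & -\beta \\ 0 & \beta - \alpha \end{bmatrix} \quad \text{at } S = 1,\ I = 0$$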


The eigenvalues of that matrix are 0 and β − α. We certainly need β < α for stability.
“Sick must get well faster than well get sick.” The other eigenvalue λ = 0 needs a closer
analysis, and the model itself requires improvement.

A neutral eigenvalue like λ = 0 can be pushed either way by nonlinear terms. One way
to establish nonlinear stability is to solve the equations, after removing t:


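$$\frac{dI}{dS} = \frac{dI/dt}{dS/dt} = \frac{\beta SI - \alpha I}{-\beta SI} = -1 + \frac{\alpha}{\beta S} \quad \text{gives} \quad I = -S + \frac{\alpha}{\beta}\ln S + C$$


The moving point travels along the curve $I + S - (\alpha/\beta)\ln S = I(0) + S(0) - (\alpha/\beta)\ln S(0)$.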


An important fact about epidemics is the serious difficulty of estimating α and β. Their
ratio $R_0 = \beta/\alpha$ controls the spread of disease: The epidemic dies out if $R_0 < 1$. One
comment about estimating β: When the epidemic is over, you could compare
$I + S - (\alpha/\beta)\ln S$ at t = 0 and t = ∞. Much more is in the books by Brauer and Castillo-Chavez,
especially Mathematical Models in Population Biology and Epidemiology.
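
A small experiment makes the threshold β versus α concrete. The sketch below is a hedged illustration, not the book's code: it steps the three SIR equations forward with small Euler steps and evaluates the quantity I + S − (α/β) ln S that follows from dI/dS = −1 + α/(βS). The values β = 0.5, α = 0.25 and the starting fractions are assumptions chosen only for the demonstration.

```python
import math

def simulate_sir(beta, alpha, s0, i0, dt=0.01, t_end=100.0):
    """Forward Euler steps for dS/dt = -beta*S*I, dI/dt = beta*S*I - alpha*I,
    dR/dt = alpha*I, working with fractions so that S + I + R = 1."""
    s, i, r = s0, i0, 1.0 - s0 - i0
    for _ in range(int(round(t_end / dt))):
        new_infections = beta * s * i
        recoveries = alpha * i
        s, i, r = (s - dt * new_infections,
                   i + dt * (new_infections - recoveries),
                   r + dt * recoveries)
    return s, i, r

def invariant(s, i, beta, alpha):
    """I + S - (alpha/beta) ln S, constant along exact solution paths."""
    return i + s - (alpha / beta) * math.log(s)

# Assumed illustrative values: beta > alpha, so the well state S = 1, I = 0 is unstable.
beta, alpha = 0.5, 0.25
s0, i0 = 0.99, 0.01
s_end, i_end, r_end = simulate_sir(beta, alpha, s0, i0)
print("final S, I, R:", s_end, i_end, r_end)
print("invariant at start:", invariant(s0, i0, beta, alpha))
print("invariant at end:  ", invariant(s_end, i_end, beta, alpha))

# With beta < alpha the infected fraction only decays.
print("final I when beta < alpha:", simulate_sir(0.2, 0.25, s0, i0)[1])
```

The invariant is only approximately preserved because Euler's method has its own error; a smaller `dt` tightens the agreement.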


The Law of Mass Action 


When two chemical species react, the law of mass action decides the rate : 


$$\frac{dy}{dt} = k\,s\,e \qquad s = \text{concentration of } S, \quad e = \text{concentration of } E$$


This is like predator-prey and epidemics (multiply one population times the other, s times e).
Then y is the concentration of SE. When E is an enzyme, there is also a reverse reaction
SE → S + E and a forward reaction SE → P + E. For a chemist, the desired product is
P. For us, there are three mass action laws with rates $k_1, k_{-1}, k_2$:


$$\frac{dy}{dt} = k_1 se - k_{-1}y - k_2 y \qquad \frac{ds}{dt} = -k_1 se + k_{-1}y \qquad \frac{de}{dt} = -k_1 se + k_{-1}y + k_2 y = -\frac{dy}{dt}$$


Life depends on enzymes: Very low concentrations e(0) ≪ s(0) and very fast reactions.
Without E, blood would take years to clot. Steaks would take decades to digest. This
math course might take a century to learn. The enzyme is the catalyst (like platinum in a
catalytic converter).

After the fast reaction that uses E, the slower reactions bring the enzyme back. Beautifully,
separating the two time scales leads to a separable equation for y :


$$\text{Michaelis-Menten equation} \qquad \frac{dy}{dt} = -\frac{cy}{y + K} \qquad (14)$$


Maini and Baker have shown how matching fast time to slow time leads to (14). 

This is just one example of the nonlinear differential equations of biology. Mathematics
can reveal the main features of the solution. For a detailed picture we turn to accurate
numerical methods, and those come in the next section.


Continuous Chaos and Discrete Chaos


This section about stability will now end with extreme instability : Chaos. For this we need 
three differential equations (or two difference equations). Chaotic problems are a recent dis- 
covery, but now we know they are everywhere: Chaos is more common than stable equations 
and even more common than ordinary instability. 

This is a deep subject, but you can see its remarkable features from simple experiments. 
Here are suggestions for one equation, then two, then the big one (Lorenz) : 


1. Newton’s method on page 6 finds square roots by solving $f(x) = x^2 - c = 0$.
Compute $x_1$, then $x_2$, then $x_3$, ... Then $x_n$ approaches $\pm\sqrt{c}$.

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} = x_n - \frac{x_n^2 - c}{2x_n} = \frac{1}{2}\left(x_n + \frac{c}{x_n}\right).$$

But if c = −1, these real x’s cannot approach the imaginary square roots x = ±i.
The $x_n$ will move around wildly when $x_{n+1} = \frac{1}{2}\left(x_n - \frac{1}{x_n}\right)$. Try 100 steps from
$x_0 = \sqrt{3}$ and $x_0 = 2$.


2. The Hénon map approaches a “strange attractor” in the xy plane:


Stretching and folding   $x_{n+1} = 1 + y_n - 1.4\,x_n^2$ and $y_{n+1} = 0.3\,x_n$


Try four steps, starting from many different x9, yo between —1 and 1. 


3. The Lorenz equations arise in trying to predict atmospheric convection and weather:


$x' = a(y - x) \qquad y' = x(b - z) - y \qquad z' = xy - cz$

Lorenz himself chose a = 10, b = 28, c = 8/3. The system becomes chaotic. The
solutions are extremely sensitive to changes in the starting values. Harvey Mudd College
has an ODE Architect Library that includes Lorenz and suggests great experiments.
Try it!
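
The first two experiments need only a few lines. The sketch below is an illustration rather than the book's website code: `newton_minus_one` iterates Newton's method for x² + 1 = 0 with real starting values, and `henon` iterates the Hénon map with the constants 1.4 and 0.3 given above. Both function names are made up for this example.

```python
import math

def newton_minus_one(x0, steps):
    """Iterate x_{n+1} = (x_n - 1/x_n)/2: Newton's method for x^2 + 1 = 0."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(0.5 * (x - 1.0 / x))
    return xs

def henon(x0, y0, steps, a=1.4, b=0.3):
    """Iterate the Henon map x -> 1 + y - a*x^2, y -> b*x."""
    points = [(x0, y0)]
    for _ in range(steps):
        x, y = points[-1]
        points.append((1.0 + y - a * x * x, b * x))
    return points

# Starting at sqrt(3) the exact iteration jumps between 1/sqrt(3) and -1/sqrt(3).
print(newton_minus_one(math.sqrt(3.0), 4))
# Starting at 2 the real iterates wander with no pattern.
print(newton_minus_one(2.0, 6))
# A few points of a Henon orbit starting from the origin.
print(henon(0.0, 0.0, 4))
```

In floating point the period-two orbit from √3 is eventually lost: roundoff errors roughly double at each step, which is itself a sign of the sensitivity that defines chaos.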


REVIEW OF THE KEY IDEAS


1. The critical points of y' = f(y, z), z' = g(y, z) solve f(Y, Z) = g(Y, Z) = 0. Steady
state y(t) = Y, z(t) = Z.

2. Near that steady state, $f(y, z) \approx (\partial f/\partial y)(y - Y) + (\partial f/\partial z)(z - Z)$. Similarly
g(y, z) is “linearized” at Y, Z. These derivatives of f and g go in a 2 × 2 matrix A.

3. The equations (y, z)' = (f, g) are stable at Y, Z when the linearized equations
(y − Y, z − Z)' = A(y − Y, z − Z) are stable. Then $\lambda_1$ and $\lambda_2$ have real parts < 0.

4. Stability at Y, Z requires $\dfrac{\partial f}{\partial y} + \dfrac{\partial g}{\partial z} < 0$ and $\dfrac{\partial f}{\partial y}\dfrac{\partial g}{\partial z} > \dfrac{\partial f}{\partial z}\dfrac{\partial g}{\partial y}$. This means that
the eigenvalues have $\lambda_1 + \lambda_2 = a + d < 0$ and $\lambda_1\lambda_2 = ad - bc > 0$.




5. Boxes and books tumble unstably around their middle axes. Footballs are neutral. 


6. Epidemics and kinetics are nonlinear when species 1 multiplies species 2: y' = kyz.


Problem Set 3.3 


1 If $y' = 2y + 3z + 4y^2 + 5z^2$ and $z' = 6z + 7yz$, how do you know that Y = 0,
Z = 0 is a critical point ? What is the 2 by 2 matrix A for linearization around
(0, 0) ? This steady state is certainly unstable because _____


2 In Problem 1, change 2y and 6z to −2y and −6z. What is now the matrix A for
linearization around (0, 0) ? How do you know this steady state is stable ?


3 The system $y' = f(y, z) = 1 - y^2 - z$, $z' = g(y, z) = -5z$ has a critical point
at Y = 1, Z = 0. Find the matrix A of partial derivatives of f and g at that point:
stable or unstable ?


4 This linearization is wrong but the zero derivatives are correct. What is missing ?
Y = 0, Z = 0 is not a critical point of y' = cos(ay + bz), z' = cos(cy + dz).

$$\begin{bmatrix} y' \\ z' \end{bmatrix} = \begin{bmatrix} -a\sin 0 & -b\sin 0 \\ -c\sin 0 & -d\sin 0 \end{bmatrix} \begin{bmatrix} y \\ z \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} y \\ z \end{bmatrix}$$


5 Find the linearized matrix A at every critical point. Is that point stable ?

(a) $y' = 1 - yz$, $z' = y - z^3$      (b) $y' = -y^3 - z$, $z' = y + z^3$


6 Can you create two equations y' = f(y, z) and z' = g(y, z) with four critical points :
(1, 1) and (1, −1) and (−1, 1) and (−1, −1) ?

I don’t think all four points could be stable. This would be like a surface with
four minimum points and no maximum.


7 The second order nonlinear equation for a damped pendulum is y'' + y' + sin y = 0.
Write z for the damping term y', so the equation is z' + z + sin y = 0.

Show that Y = 0, Z = 0 is a stable critical point at the bottom of the pendulum.
Show that Y = π, Z = 0 is an unstable critical point at the top of the pendulum.


8 Those pendulum equations y' = z and z' = −sin y − z have infinitely many critical
points ! What are two more and are they stable ?


9 The Liénard equation y'' + p(y)y' + q(y) = 0 gives the first order system y' = z and
z' = −p(y)z − q(y). What are the equations for a critical point ? When is it stable ?


10 Are these matrices stable or neutrally stable or unstable (source or saddle) ? 


Lee ee Wiese


11 Suppose a predator x eats a prey y that eats a smaller prey z:

dx/dt = −x + xy          Find all critical points X, Y, Z
dy/dt = −xy + y + yz     Find A at each critical point
dz/dt = −yz + 2z         (9 partial derivatives)


12 The damping in $y'' + (y')^3 + y = 0$ depends on the velocity y' = z. Then
$z' + z^3 + y = 0$ completes the system. Damping makes this nonlinear system
stable. Is the linearized system stable ?


13 Determine the stability of the critical points (0, 0) and (2, 1):

(a) $y' = -y + 4z + yz$          (b) $y' = -y^2 + 4z$
    $z' = -y - 2z + 2yz$             $z' = y - 2z^4$


Problems 14-17 are about Euler’s equations for a tumbling box. 


14 The correct coefficients involve the moments of inertia $I_1, I_2, I_3$ around the axes.
The unknowns x, y, z give the angular momentum around the three principal axes:

dx/dt = ayz   with $a = (1/I_3 - 1/I_2)$
dy/dt = bxz   with $b = (1/I_1 - 1/I_3)$
dz/dt = cxy   with $c = (1/I_2 - 1/I_1)$.

Multiply those equations by x, y, z and add. This proves that $x^2 + y^2 + z^2$ is _____


15 Find the 3 by 3 first derivative matrix from those three right hand sides f, g, h.
What is the matrix A in the 6 linearizations at the same 6 critical points ?


16 You almost always catch an unstable tumbling book at a moment when it is flat.
That tells us: The point x(t), y(t), z(t) spends most of its time (near) (far from)
the critical point (0, 1, 0). This brings the travel time t into the picture.


17 In reality what happens when you

(a) throw a baseball with no spin (a knuckleball) ?
(b) hit a tennis ball with overspin ?
(c) hit a golf ball left of center ?
(d) shoot a basketball with underspin (a free throw) ?




3.4 The Basic Euler Methods 


For most differential equations, solutions are numerical. We solve model equations to 
understand what to expect in more complicated problems. Then the numbers we need— 
close to exact but never perfect—come from finite time steps At. 

This section will show you the key ideas. The approximations will be simple and clear, 
but not highly accurate. The next section comes closer to the reality of modern codes. 
The Runge-Kutta method is still frequently used, with refinements that those two creators 
certainly did not anticipate. The cycle of predicting at t + Δt, correcting at t + Δt, and
adjusting the stepsize Δt for the next step is now highly developed.

Local accuracy comes from small steps, but speed comes from larger steps. The right bal- 
ance depends on the particular equation and the user’s need for accuracy. Always 
there is a requirement of stability—because small errors are unavoidable. But after the 
numerical errors enter the calculation, they must not grow faster than the solution itself. 


Euler’s First Step $y_1 = y_0 + \Delta t\, f_0$


The equation to solve is dy/dt = f(t, y). The initial value y(0) is given, and this will be our
starting $y_0$. A difference equation will go forward to $y_1$. That is our approximation to the
exact solution at $t_1 = \Delta t$ (the end of the first time step and the start of the next step).
By going forward in steps of size $\Delta t_1, \Delta t_2, \ldots$ we compute values $y_1, y_2, \ldots$ that are close
to the exact solution.

We know two facts at t = 0. The value of y is $y_0$ and the slope dy/dt at that point
is given by f in the equation. That slope is called $f_0$. It is the right side f(t, y) when
$y = y_0$ and t = 0. With value $y_0$ and slope $f_0$, we know the tangent line $y = y_0 + t f_0$
to the curve y(t). So we can take a step Δt along that tangent line, but not too large a step
or we will wander too far from the exact curve y(t).


Step Δt along tangent line   $y_1 = y_0 + \Delta t\, f_0$   (1)


Figure 3.10 shows $y_1$ for the model equation y' = 2y. At $y_0 = 1$ the slope is $f_0 = 2$
(since f(y) = 2y). We follow that tangent line as far as $y_1 = 1 + 2\Delta t$.

Figure 3.10: The tangent line $y = y_0 + t f_0$ starts at $y_0 = 1$. Euler stops at $y_1 = y_0 + \Delta t\, f_0 = 1 + 2\Delta t$
on the tangent line. The exact value $y = e^{2\Delta t}$ is on the solution curve.


Euler’s Method $y_{n+1} = y_n + \Delta t\, f_n$


On the graph, we are following pieces of tangent lines. This is the same as approximating
the derivative dy/dt (which changes during a time step) by the forward difference Δy/Δt
(which is held constant during a time step) :

$$\frac{dy}{dt} = f(t, y) \quad \text{becomes} \quad \frac{\Delta y}{\Delta t} = \frac{y_1 - y_0}{\Delta t} = f_0 \qquad (2)$$

There is a new tangent line for the second time step. That step starts at $y_1$ (which we just
computed). The slope at that point in time is $f_1 = f(\Delta t, y_1)$. We are using the differential
equation y' = f(t, y) to tell us the slopes $f_0, f_1, f_2, \ldots$ at the start of every time step:

$$n\text{th time step} \qquad \frac{\Delta y}{\Delta t} = f(t_n, y_n) \ \text{ is Euler's method } \ \frac{y_{n+1} - y_n}{\Delta t} = f_n \qquad (3)$$

The model equation dy/dt = 2y has the exact solution $y(t) = e^{2t}$. Euler’s method
$y_{n+1} = y_n + \Delta t\, f_n$ will multiply $y_n$ at every step by the number $1 + 2\Delta t$:

$$y_{n+1} = y_n + \Delta t\,(2y_n) = (1 + 2\Delta t)\,y_n \quad \text{leads to} \quad y_n = (1 + 2\Delta t)^n y_0. \qquad (4)$$

We have seen powers of $(1 + \frac{1}{n})$ and $(1 + \frac{a}{n})$ in Section 1.3 from compound interest.
The current balance was $y_n$ and the interest at rate a was $a\,\Delta t\, y_n$. Then the new balance was
$y_{n+1} = (1 + a\,\Delta t)\,y_n$. This is exactly Euler’s method to solve dy/dt = ay, and our example
has a = 2.


Approximating $e^{2t}$   $y_n = (1 + 2\Delta t)^n \approx (e^{2\Delta t})^n = e^{2n\Delta t}$.   (5)


The errors $y_n - y$ grow as n increases. But the errors at each step also shrink as Δt → 0.
If we hold n Δt fixed at some value T, then we are taking n steps to reach that time T.
As n increases and Δt decreases, the steps are smaller and the tangent lines stay closer.
Then Euler’s $y_n$ approaches the exact $y(T) = e^{2T}$.
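
The convergence of Euler's method for y' = 2y is easy to watch on a computer. The following sketch is an illustration under simple assumptions: the function name `euler_model` is made up, and the final time T = 1/2 is chosen so that the exact answer is $e^{2T} = e$ and the Euler value is $(1 + 1/n)^n$.

```python
import math

def euler_model(a, T, n, y0=1.0):
    """Take n forward Euler steps of size T/n for the model equation y' = a*y."""
    dt = T / n
    y = y0
    for _ in range(n):
        y = y + dt * (a * y)      # y_{n+1} = y_n + dt * f_n with f = a*y
    return y

exact = math.exp(2 * 0.5)         # y(T) = e^{2T} with T = 1/2
for n in (10, 20, 40, 80, 160):
    approx = euler_model(2.0, 0.5, n)
    print(n, approx, exact - approx)
```

Each doubling of n roughly halves the printed error. That slow improvement is what first order accuracy looks like in practice.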


Euler’s Error 


The error $E_n$ is $y(n\,\Delta t) - y_n$. This is the exact solution minus the computed solution $y_n$,
at time n Δt. It comes from accumulating small errors at every time step, as the tangent
lines move away from the true graph of y(t).

First, estimate those small errors at the n separate time steps. How far is a tangent line 
from a curve, after a step At? The answer comes from calculus. 


Local error   $y(t + \Delta t) = y(t) + \Delta t\, y'(t) + \frac{1}{2}(\Delta t)^2 y''(t) + \cdots$   (6)


When we keep two terms and omit the third term, the error is < $(\Delta t)^2\, |y''|_{\max}$.


Figure 3.11: Euler’s method converges to y(T) as n → ∞, with n steps of size Δt = T/n.
(The figure starts from $y_0 = 1$ and compares one step of the old Δt with four steps of the new Δt,
where old Δt = 4 (new Δt), reaching $y_4 = (1 + 2\,\text{new}\,\Delta t)^4$.)


The Mean Value Theorem would establish that bound of order $(\Delta t)^2$. This is the error
in one step, a tangent line moving away from the curve. We will take n steps to reach
the time n Δt = T. If all goes well, the 1-step error $C(\Delta t)^2$ grows in n steps to $CT\,\Delta t$.


The error at time T after n steps is $|y(T) - y_n| \le Cn(\Delta t)^2 = CT\,\Delta t$.   (7)


Conclusion: Euler’s method is first-order accurate. The error is proportional to Δt.
If we take 2n steps of size Δt/2, and do twice as much work, that will divide the error
by 2 (approximately). This is really minimum accuracy.

The Runge-Kutta method has error proportional to $(\Delta t)^4$. Then reducing Δt to Δt/2
improves the error by a factor near 16. We will be matching many more terms in the
Taylor series, where Euler only matched the first derivative. In the example y' = 2y, we
know that $y(T) = e^{2T}$:


First-order accuracy   $(1 + 2\Delta t)^n = \left(1 + \dfrac{2T}{n}\right)^n \approx e^{2T}$ with error $\dfrac{C}{n}$   (8)


This table shows the slow improvement as n increases, compared to the superfast 
improvement from keeping more terms in the Taylor series : 


 n    (1 + 1/n)^n from Euler    Taylor series for e
 1         2.0000000                2.0000000
 2         2.2500000                2.5000000
 3         2.3703704                2.6666667
 4         2.4414062                2.7083333
 5         2.4883200                2.7166667
 6         2.5216264                2.7180556
 7         2.5464997                2.7182540
 8         2.5657845                2.7182788
 9         2.5811748                2.7182815
10         2.5937425                2.7182818


Stability


We jumped over an important point when we converted n local errors of size $(\Delta t)^2$
to one global error of size Δt. The local errors occur in each step. The global error
at T is the composite of n local errors. We assumed that local errors at early times
would not grow much before the final time T.

Think of the local error as a small bank deposit every day. The global error at the end of
a year (T = 365 Δt) includes 365 small errors. Those small deposits should grow during
the year (they earn interest too). The constant C in equation (8) allows for this growth.

What if the equation is dy/dt = −100y ? This shows decay, not growth. The solution
starting at y(0) = 1 is $y(T) = e^{-100T}$, very small. But does Euler’s method show the
same fast decay in the approximate solution, when the equation has $f_n = -100\,y_n$ ?


$y_{n+1} = y_n + \Delta t\, f_n = (1 - 100\,\Delta t)\,y_n \qquad y_n = (1 - 100\,\Delta t)^n\, y_0$   (9)


If 100Δt is small, then 1 − 100Δt is less than 1 and its powers decay as they should.
But we will have 100Δt = 3 when Δt = 0.03. That step seems small but it is not.
The number 1 − 100Δt will be −2. Equation (9) shows that every step multiplies by −2.
The powers of −2 grow exponentially !


$y_n = 1, -2, 4, -8, \ldots$   $y_n = (1 - 100\,\Delta t)^n y_0 = (-2)^n y_0$ is exponentially unstable.


Conclusion : Stability for y' = −100y requires |1 − 100Δt| < 1. We need Δt < 2/100.

In a way this limit on Δt is acceptable. Euler is missing the $\frac{1}{2}(100\,\Delta t)^2$ term in the
Taylor series for $e^{-100\Delta t}$. We would want 100Δt < 1 just for reasonable accuracy. The
stability requirement 100Δt < 2 is not a heavy burden. But read further.
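
The instability is easy to reproduce. The sketch below is an illustration (the function name `euler_decay` is made up): it applies forward Euler to y' = −100y, first with a step past the stability limit Δt = 2/100 and then with a step inside it.

```python
def euler_decay(dt, steps, rate=100.0, y0=1.0):
    """Forward Euler for y' = -rate*y. Each step multiplies y by (1 - rate*dt)."""
    ys = [y0]
    for _ in range(steps):
        ys.append(ys[-1] + dt * (-rate * ys[-1]))
    return ys

# dt = 0.03 gives the multiplier 1 - 3 = -2: the computed values oscillate and grow.
print(euler_decay(0.03, 5))
# dt = 0.005 gives the multiplier 1/2: the computed values decay, as the true solution does.
print(euler_decay(0.005, 5))
```

Both runs approximate a solution that decays extremely fast. Only the second one shows it.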


Stiff Equations 


Imagine an equation with solutions $e^{-t}$ and $e^{-100t}$. Then $e^{-t}$ will dominate, because
it has much slower decay than $e^{-100t}$. We have decay rates s = −1 and s = −100:

$$y'' + 101\,y' + 100\,y = 0 \quad \text{with} \quad s^2 + 101s + 100 = (s + 1)(s + 100). \qquad (10)$$


This is certainly overdamped. The roots s = −1 and s = −100 are real. Euler’s method
needs to follow $e^{-t}$ accurately, because that is the important solution. But stability still
requires Δt < 2/100.

The unimportant solution $e^{-100t}$ is getting in the way. It reduces Δt and therefore adds
more work (many steps), beyond the ordinary demand of first order accuracy. A problem
like equation (10) is called stiff: stability can be too expensive for ordinary Euler.

We can see this second order problem as two first order equations. Introduce y' as a
second unknown. As in Section 3.1, a “companion matrix” multiplies the vector (y, y'):

$$y'' + 101\,y' + 100\,y = 0 \quad \text{is the same as} \quad \frac{d}{dt}\begin{bmatrix} y \\ y' \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -100 & -101 \end{bmatrix} \begin{bmatrix} y \\ y' \end{bmatrix}. \qquad (11)$$


The eigenvalues of the matrix are the same roots −1 and −100. That is a stiff problem:
slow decay together with fast decay.

Euler’s method for this matrix equation is just like Euler for y' = Ay:

$$\frac{\mathbf{y}_{n+1} - \mathbf{y}_n}{\Delta t} = A\,\mathbf{y}_n \quad \text{or} \quad \mathbf{y}_{n+1} = (I + A\,\Delta t)\,\mathbf{y}_n. \qquad (12)$$


Every step multiplies by I + AΔt. That matrix has eigenvalues 1 − Δt and 1 − 100Δt.
Normally 1 − Δt is more important and larger. But if 100Δt is greater than 2, then the
second number 1 − 100Δt is below −1. Its powers will show extreme instability.

The cure for stiff systems is to switch to an implicit method. 


Backward Euler = Implicit Euler 


The idea of implicit methods is to use backward differences. Go back from $y_{n+1}$ and
$t_{n+1}$ and $f_{n+1}$, instead of going forward from $y_n$ and $t_n$ and $f_n$.

$$\text{Backward Euler} \qquad \frac{y^B_{n+1} - y_n}{\Delta t} = f_{n+1} = f(t_{n+1}, y^B_{n+1}). \qquad (13)$$


The example y' = −100y will divide by 1 + 100Δt instead of multiplying by 1 − 100Δt:

$$\frac{y^B_{n+1} - y_n}{\Delta t} = -100\,y^B_{n+1} \quad \text{is} \quad (1 + 100\,\Delta t)\,y^B_{n+1} = y_n.$$


That division happens at every time step. After n steps this method remains very stable:

$$\text{“Implicit Euler”} \qquad y^B_n = \left(\frac{1}{1 + 100\,\Delta t}\right)^n y_0 \ \text{ is decreasing correctly.}$$


For this linear equation, division is no more expensive than multiplication. Implicit is
the way to go. But we pay a much higher price for implicit when the problem is nonlinear.
Instead of substituting the known $y_n$ to find $f_n = f(n\,\Delta t, y_n)$ in ordinary “explicit” Euler,
we now have to solve a nonlinear equation to find the unknown $y^B_{n+1}$:


Each step must solve for $y^B_{n+1}$   $y^B_{n+1} - \Delta t\, f(t_{n+1}, y^B_{n+1}) = y_n$.   (14)


If the forcing function f is complicated, even an approximate solution for $y^B_{n+1}$ will be
expensive. You see the struggle that is constantly presented: Implicit methods are more
stable but much slower. For y' = Ay, the matrix to invert is in $(I - \Delta t\, A)\,\mathbf{y}^B_{n+1} = \mathbf{y}_n$.
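
Here is a hedged sketch of the two methods on the stiff system y'' + 101y' + 100y = 0, written as two first order equations with v = y'. The function names are made up for this illustration, and the 2 by 2 system $(I - \Delta t\,A)\,\mathbf{y}_{n+1} = \mathbf{y}_n$ is solved by Cramer's rule to keep the code free of libraries. The step Δt = 0.03 is deliberately larger than the explicit limit 2/100.

```python
import math

def forward_step(y, v, dt):
    """Explicit Euler: multiply (y, v) by I + dt*A with A = [[0, 1], [-100, -101]]."""
    return y + dt * v, v + dt * (-100.0 * y - 101.0 * v)

def backward_step(y, v, dt):
    """Implicit Euler: solve (I - dt*A) [y_new, v_new] = [y, v] by Cramer's rule."""
    det = 1.0 + 101.0 * dt + 100.0 * dt * dt      # equals (1 + dt)(1 + 100 dt)
    y_new = ((1.0 + 101.0 * dt) * y + dt * v) / det
    v_new = (v - 100.0 * dt * y) / det
    return y_new, v_new

def run(step, dt, steps, y=1.0, v=0.0):
    """Apply the same one-step method repeatedly, starting from y(0) = 1, y'(0) = 0."""
    for _ in range(steps):
        y, v = step(y, v, dt)
    return y

dt, steps = 0.03, 100                              # reaches T = 3
exact = (100.0 * math.exp(-3.0) - math.exp(-300.0)) / 99.0
print("exact y(3):     ", exact)
print("forward Euler:  ", run(forward_step, dt, steps))
print("backward Euler: ", run(backward_step, dt, steps))
```

Forward Euler is destroyed by the fast component even though that component is almost invisible in the true solution. Backward Euler stays close to the slow decay $e^{-t}$ with the same step.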


Difference Equations vs Differential Equations 


Compare $a^n$ with $e^{at}$ : powers and exponentials. The powers come from a difference
equation $y_{n+1} = a\,y_n$. The exponentials come from a differential equation y' = ay. Stability
means that those solutions approach zero. For ordinary numbers (this includes complex
numbers) the test on a is easy.


$a^n \to 0$ when |a| < 1      $e^{at} \to 0$ when Re a < 0.


When we have a matrix A, the same tests are applied to the eigenvalues :


$A^n \to 0$ when all $|\lambda_i| < 1$      $e^{At} \to 0$ when all Re $\lambda_i$ < 0.


REVIEW OF THE KEY IDEAS


1. Euler’s method is $(y_{n+1} - y_n)/\Delta t = f_n$ or $y_{n+1} = y_n + \Delta t\, f(n\,\Delta t, y_n)$.

2. That step to $y_{n+1}$ follows the tangent line at $y_n$, not the curve y(t). Error ≈ $(\Delta t)^2$.

3. After n steps to time T = n Δt, the error is proportional to Δt: First order accuracy.

4. Stability requires $y_n$ to grow no faster than the exact y(t) : Often a size limit on Δt.

5. Backward Euler is $y^B_{n+1} - y_n = \Delta t\, f(y^B_{n+1})$. Harder to find $y^B_{n+1}$ but more stable.


Problem Set 3.4 
1 Apply Euler’s method $y_{n+1} = y_n + \Delta t\, f_n$ to find $y_1$ and $y_2$ with Δt = 1/2:

(a) y' = y   (b) $y' = y^2$   (c) y' = 2ty   (all with $y(0) = y_0 = 1$)


2 For the equations in Problem 1, find $y_1$ and $y_2$ with the step size reduced to Δt = 1/4.
Now the value $y_2$ is an approximation to the exact y(t) at what time t ?
Then $y_2$ in this question corresponds to which $y_n$ in Problem 1 ?


3 (a) For dy/dt = y starting from $y_0 = 1$, what is Euler’s $y_n$ when Δt = 1 ?
(b) Is it larger or smaller than the true solution $y = e^t$ at time t = n ?
(c) What is Euler’s $y_{2n}$ when Δt = 1/2 ? This is closer to the true $y(n) = e^n$.


4 For dy/dt = −y starting from $y_0 = 1$, what is Euler’s approximation $y_n$ after n steps
of size Δt ? Find all the $y_n$’s when Δt = 1. Find all the $y_n$’s when Δt = 2. Those
time steps are too large for this equation.


5 The true solution to $y' = y^2$ starting from y(0) = 1 is y(t) = 1/(1 − t). This
explodes at t = 1. Take 3 steps of Euler’s method with Δt = 1/3 and take 4 steps
with Δt = 1/4. Are you seeing any sign of explosion ?


6 The true solution to dy/dt = −2ty with y(0) = 1 is the bell-shaped curve $y = e^{-t^2}$. It
decays quickly to zero. Show that step n + 1 of Euler’s method gives
$y_{n+1} = (1 - 2n\,\Delta t^2)\,y_n$. Do the $y_n$’s decay toward zero ? Do they stay there ?


7 The equations y' = −y and z' = −10z are uncoupled. If we use Euler’s method for
both equations with the same Δt between 1/5 and 2, show that $y_n \to 0$ but $|z_n| \to \infty$.
The method is failing on the solution $z = e^{-10t}$ that should decay fastest.


8 What values $y_1$ and $y_2$ come from backward Euler for dy/dt = −y starting from
$y_0 = 1$ ? Show that $y^B_1 < 1$ and $y^B_2 < 1$ even if Δt is very large. We have absolute
stability : no limit on the size of Δt.


9 The logistic equation $y' = y - y^2$ has an S-curve solution in Section 1.7 that
approaches y(∞) = 1. This is a steady state because y' = 0 when y = 1.

Write Euler’s approximation $y_{n+1} =$ _____ to this logistic equation, with stepsize
Δt. Show that this has the same steady state: $y_{n+1}$ equals $y_n$ if $y_n = 1$.


10 The important question in Problem 9 is whether the steady state $y_n = 1$ is stable
or unstable. Subtract 1 from both sides of Euler’s $y_{n+1} = y_n + \Delta t\,(y_n - y_n^2)$:

$$y_{n+1} - 1 = y_n + \Delta t\,(y_n - y_n^2) - 1 = (y_n - 1)(1 - \Delta t\, y_n).$$

Each step multiplies the distance from 1 by $(1 - \Delta t\, y_n)$. Near the steady $y_\infty = 1$,
$1 - \Delta t\, y_n$ has size |1 − Δt|. For which Δt is this smaller than 1 to give stability ?


11 Apply backward Euler $y^B_{n+1} = y_n + \Delta t\, f^B_{n+1} = y_n + \Delta t\,\left[\,y^B_{n+1} - (y^B_{n+1})^2\,\right]$ to the
logistic equation $y' = f(y) = y - y^2$. What is $y^B_1$ if $y_0 = 1/2$ and Δt = 1/4 ?

You have to solve a quadratic equation to find $y^B_1$. I am finding two answers for $y^B_1$.
A computer code might choose the answer closer to $y_0$.


12 For the bell-shaped curve equation y' = −2ty, show that backward Euler divides
$y_n$ by $1 + 2n(\Delta t)^2$ to find $y^B_{n+1}$. As n → ∞, what is the main difference from
forward Euler in Problem 6 ?


13 The equation $y' = \sqrt{|y|}$ has many solutions starting from y(0) = 0. One solution
stays at y(t) = 0, another solution is $y = t^2/4$. (Then y' = t/2 agrees with $\sqrt{y}$.)
Other solutions can stay at y = 0 up to t = T, and then switch to the parabola
$y = (t - T)^2/4$. As soon as y leaves the bad point y = 0, where $f(y) = y^{1/2}$
has infinite slope, the equation has only one solution.

Backward Euler $y_1 - \Delta t\,\sqrt{|y_1|} = y_0 = 0$ gives two correct values $y^B_1 = 0$ and
$y^B_1 = (\Delta t)^2$. What are the three possible values of $y^B_2$ ?


14 Every finite difference person will think of averaging forward and backward Euler:

$$\text{Centered Euler / Trapezoidal} \qquad y^C_{n+1} - y_n = \Delta t\left(\frac{1}{2} f_n + \frac{1}{2} f^C_{n+1}\right)$$

For y' = −y the key questions are accuracy and stability. Start with y(0) = 1.

$$y^C_1 - y_0 = \Delta t\left(-\frac{1}{2}\,y_0 - \frac{1}{2}\,y^C_1\right) \quad \text{gives} \quad y^C_1 = \frac{1 - \Delta t/2}{1 + \Delta t/2}\; y_0$$

Stability Show that |1 − Δt/2| < |1 + Δt/2| for all Δt. No stability limit on Δt.

Accuracy For $y_0 = 1$ compare the exact $y_1 = e^{-\Delta t} = 1 - \Delta t + \frac{1}{2}\Delta t^2 - \cdots$
with $y^C_1 = (1 - \frac{1}{2}\Delta t)/(1 + \frac{1}{2}\Delta t) = (1 - \frac{1}{2}\Delta t)(1 - \frac{1}{2}\Delta t + \frac{1}{4}\Delta t^2 - \cdots)$.

An extra power of Δt is correct: Second order accuracy. A good method.


The website has codes for Euler and Backward Euler and Centered Euler. Those
methods are slow and steady with first order and second order accuracy. The test problems
give comparisons with faster methods like Runge-Kutta.


3.5 Higher Accuracy with Runge-Kutta


The section on basic Euler methods contained two messages. First, those methods are 
simple to understand (they follow a tangent line). Second, those methods are too simple 
to give good or even adequate accuracy. This section brings major improvements. The 
fourth order Runge-Kutta method is the basis for ode45, the workhorse among all of 
MATLAB’s codes for solving y’ = f(t, y). 

Notice that this equation, linear or more likely nonlinear, involves first derivatives y'
and no higher derivatives. In case the original equation is y'' = F(t, y, y'), introduce
$y_1' = y_2$ as a new equation together with the original $y_2' = F(t, y_1, y_2)$. The unknowns
$y_1 = y$ and $y_2 = y'$ go into a vector $\mathbf{y}$. The right hand sides $y_2$ and F go into a vector $\mathbf{f}$.


$y'' = F(t, y, y')$        $y_1' = y_2$                       n equations for
                           $y_2' = F(t, y_1, y_2)$            n unknown y’s:  $\mathbf{y}' = \mathbf{f}(t, \mathbf{y})$


In the middle is a system of two equations coming from y'' = F. On the right is a system
of n equations for the vector $\mathbf{y}$ of n unknowns. The n equations $\mathbf{y}' = \mathbf{f}(t, \mathbf{y})$ start from n
initial conditions $y_1(0), \ldots, y_n(0)$, and $\mathbf{f}$ is a vector of n right hand sides.

We are ready for more accurate approximations to y' = f(t, y) and $\mathbf{y}' = \mathbf{f}(t, \mathbf{y})$.


Improved Euler = Simplified Runge-Kutta 


Euler’s first order method is $y^E_{n+1} = y_n + \Delta t\, f_n$. Let me describe an improvement to
second order accuracy, which means an error of size $(\Delta t)^2$. This uses the Runge-Kutta idea:
Substitute Euler’s $y^E_{n+1}$ once more into f. Use that output to get a better $y^S_{n+1}$ !

$$\begin{matrix}\text{Improved Euler} \\ \text{Simplified RK}\end{matrix} \qquad \frac{y^S_{n+1} - y_n}{\Delta t} = \frac{1}{2}\, f(t_n, y_n) + \frac{1}{2}\, f(t_{n+1}, y^E_{n+1}) \qquad (1)$$


Let me show you the improvement for y' = ay. In this case f(t, y) is ay. You can see $y^E$ as
a prediction of the next value $y_{n+1}$ and $y^S$ as a correction:

$$y^E_{n+1} = y_n + a\,\Delta t\, y_n \quad \text{goes into} \quad y^S_{n+1} = y_n + \frac{1}{2}\, a\,\Delta t\, y_n + \frac{1}{2}\, a\,\Delta t\,(y_n + a\,\Delta t\, y_n). \qquad (2)$$

When that last term is multiplied out, we see the correct $(\Delta t)^2$ term included in $y^S_{n+1}$ :

$$\text{Linear case } y' = ay \qquad y^S_{n+1} = y_n + a\,\Delta t\, y_n + \frac{1}{2}\, a^2 (\Delta t)^2\, y_n. \qquad (3)$$

We are following the tangent parabola starting at $y_n$. The parabola stays much closer to the
true y(t) curve than the tangent line. This improvement means a $(\Delta t)^3$ error at each step.
With stability, those errors produce a $(\Delta t)^2$ overall error after n = T/Δt steps.
The exact y(t + Δt) is $e^{a\Delta t}\, y(t)$. Equation (3) has three correct terms of $e^{a\Delta t}$. Euler
uses the slope y' = f(t, y) only at the start of the time step, but the improvement $y^S$
in equation (1) averages the slope at the start and the end of the step.
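
One step of this predictor-corrector idea takes two evaluations of f. The sketch below is an illustration (the function name `improved_euler_step` is made up, and the values a = −3 and Δt = 0.1 are arbitrary): it checks that one step for y' = ay reproduces the three terms 1 + aΔt + ½(aΔt)² of equation (3).

```python
def improved_euler_step(f, t, y, dt):
    """Predict with Euler (y^E), then average the slopes at both ends of the step (y^S)."""
    slope_start = f(t, y)
    y_pred = y + dt * slope_start               # Euler prediction y^E
    slope_end = f(t + dt, y_pred)               # slope at the predicted end point
    return y + 0.5 * dt * (slope_start + slope_end)

a, dt = -3.0, 0.1                               # assumed illustrative values

def f(t, y):
    return a * y

z = a * dt
one_step = improved_euler_step(f, 0.0, 1.0, dt)
print(one_step, 1 + z + z * z / 2)              # the two numbers agree
```

The same function works unchanged for nonlinear right hand sides f(t, y); only the comparison with the series for $e^{a\Delta t}$ is special to the linear case.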


Simplified Adams Method


Here is another way to achieve second order accuracy. Save and reuse the computed
value $y_{n-1}$ at the previous time t − Δt. With the right coefficients 3/2 and −1/2,
and essentially no extra work, we can again capture the term $\frac{1}{2}(\Delta t)^2\, y''$ that Euler missed.

$$\begin{matrix}\text{Adams-Bashforth} \\ \text{Multistep method}\end{matrix} \qquad y_{n+1} = y_n + \frac{3}{2}\,\Delta t\, f(t_n, y_n) - \frac{1}{2}\,\Delta t\, f(t_{n-1}, y_{n-1}). \qquad (4)$$


All we do is to save each computed value of $f_n$ for one more step. That number becomes the
$f_{n-1}$ term in (4). The right hand side of (4) gives the correct y' and y'' terms:

$$y_n + \frac{3}{2}\,\Delta t\, y_n' - \frac{1}{2}\,\Delta t\, y_{n-1}' \approx y_n + \frac{3}{2}\,\Delta t\, y_n' - \frac{1}{2}\,\Delta t\,(y_n' - \Delta t\, y_n'') = y_n + \Delta t\, y_n' + \frac{1}{2}(\Delta t)^2\, y_n''$$


Each extra step back to $y_{n-2}, y_{n-3}, \ldots$ can increase the accuracy by 1. Those multi-step
methods compete with Runge-Kutta and eventually they win. But fourth order is still mostly
on the R-K side. One reason is that Adams needs a special effort to compute $y_{-1}$ before the
first step can begin. Runge-Kutta starts cold.

Runge-Kutta easily changes Δt from one step to the next. On the other hand, its four
evaluations of f(t, y) could be expensive. Stiff systems need backward differences.
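
A minimal sketch of the two-step Adams-Bashforth formula follows. One choice here is an assumption and not from the text: instead of computing a value before the first step, the code takes a single improved Euler step to obtain the second starting value, and then reuses one saved slope at every later step. The function name is made up for this illustration.

```python
import math

def adams_bashforth2(f, t0, y0, dt, steps):
    """Two-step Adams-Bashforth: y_{n+1} = y_n + dt*(3/2 f_n - 1/2 f_{n-1}).
    Starter (an assumption of this sketch): one improved Euler step."""
    f_prev = f(t0, y0)
    y_pred = y0 + dt * f_prev
    y = y0 + 0.5 * dt * (f_prev + f(t0 + dt, y_pred))
    t = t0 + dt
    for _ in range(steps - 1):
        f_now = f(t, y)                        # the only new evaluation in this step
        y = y + dt * (1.5 * f_now - 0.5 * f_prev)
        f_prev = f_now                         # save the slope for the next step
        t = t + dt
    return y

def growth(t, y):
    return y                                   # model problem y' = y, exact y(1) = e

for n in (10, 20, 40):
    approx = adams_bashforth2(growth, 0.0, 1.0, 1.0 / n, n)
    print(n, approx, math.e - approx)
```

Halving Δt divides the printed error by about 4, the signature of second order accuracy, with one evaluation of f per step after the start.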


Fourth Order Runge-Kutta 


The famous version of Runge-Kutta uses four evaluations of the right side. It starts at time
$t_n$ with solution $y^{RK}_n$. It reaches time $t_{n+1} = t_n + \Delta t$ with approximate solution $y^{RK}_{n+1}$.
On the way, Runge-Kutta stops twice for $k_2$ and $k_3$ at $t_{n+1/2} = t_n + \frac{1}{2}\Delta t$.


At each step from $t_n$ to $t_{n+1}$ compute $k_1, k_2, k_3, k_4$:

$k_1 = f(t_n, y_n)/2$
$k_2 = f(t_{n+1/2},\ y_n + \Delta t\, k_1)/2$
$k_3 = f(t_{n+1/2},\ y_n + \Delta t\, k_2)/2$
$k_4 = f(t_{n+1},\ y_n + 2\,\Delta t\, k_3)/2$


A combination of those four k’s gives fourth-order accuracy for $y^{RK}_{n+1}$ :

$$\text{Runge-Kutta step} \qquad \frac{y_{n+1} - y_n}{\Delta t} = \frac{1}{3}\,(k_1 + 2k_2 + 2k_3 + k_4)$$


That short line is one of the most important formulas in this book. Among highly accurate
methods, Runge-Kutta is especially easy to code and run, probably the easiest there is.
Before each step, we decide on Δt. For the model problem y' = y the R-K combination
produces five correct terms in the series for $e^{\Delta t}$. You can see evaluations of f inside evaluations
of f, starting with $k_1 = f_n/2 = y/2$:

$$k_2 = \frac{1}{2}\left(y + \frac{\Delta t}{2}\,y\right) \quad k_3 = \frac{1}{2}\left(y + \frac{\Delta t}{2}\left(y + \frac{\Delta t}{2}\,y\right)\right) \quad k_4 = \frac{1}{2}\left(y + \Delta t\left(y + \frac{\Delta t}{2}\left(y + \frac{\Delta t}{2}\,y\right)\right)\right)$$


Problem 1 will simplify $k_1 + 2k_2 + 2k_3 + k_4$. The new $y_{n+1}$ at the end of the step is
$y_{n+1} = \left(1 + \Delta t + \cdots + \frac{1}{24}(\Delta t)^4\right) y_n$. All terms correct for $e^{\Delta t}$ and 4th order accuracy.


The Stability of Runge-Kutta


To determine the limit of stability, apply the method to y' = −y. The true solution
$y = e^{-t}\, y(0)$ will decrease. But if Δt is too large, the approximations $y_n$ will increase in size.
The first example of possible instability was Euler’s method:


Euler instability for Δt > 2   $y^E_{n+1} = (1 - \Delta t)\,y_n$ has |1 − Δt| > 1


When we apply the same test to Runge-Kutta, instability enters for Δt > 2.78:

$$\text{RK instability for } \Delta t = 3 \qquad 1 - 3 + \frac{1}{2}\,9 - \frac{1}{6}\,27 + \frac{1}{24}\,81 = \frac{11}{8} > 1.$$


The full infinite series would give the small number $e^{-3}$. But these five terms give
a multiplier 11/8 that is larger than 1. If we take this over-large step n times, the
Runge-Kutta approximation $y_n = (11/8)^n$ will be enormous and completely wrong.
The more exact stability limit is −aΔt < 2.78 for y' = ay.
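
The stability limit can be located numerically. The sketch below is an illustration with made-up function names: it evaluates the five-term multiplier for one Runge-Kutta step on y' = ay with z = aΔt, confirms the value 11/8 at z = −3, and then uses bisection between 2 and 3 to find where the multiplier first reaches 1.

```python
def rk4_multiplier(z):
    """Growth factor of one fourth order Runge-Kutta step for y' = a*y, z = a*dt."""
    return 1 + z + z**2 / 2 + z**3 / 6 + z**4 / 24

def stability_limit(lo=2.0, hi=3.0, iterations=60):
    """Bisection for the step size x with rk4_multiplier(-x) = 1.
    The multiplier is below 1 at x = 2 and above 1 at x = 3."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if rk4_multiplier(-mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(rk4_multiplier(-3.0))     # 1.375 = 11/8
print(stability_limit())        # close to 2.785
```

The limit near 2.785 is the number rounded to 2.78 in the text.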


Example 1 Apply all three methods to dy/dt = y. The true solution y = ef reaches 


y = e = 2.71828... at time t = 1. Try At = 0.2 and 0.1. 

At = 0.2 y= y* pt At = 0.1 ye y* gs 
t= 0 1 1 1 10) 1 1 1 
1.10 1.1050 1.1051708 
1.21 1.2210 1.2214026 
1.33 1.8492 1.3498585 
1.46 1.4909 1.4918242 
1.61 1.6474 1.6487206 


1 
a 1.20. 1.220 = 1.221400 2 
3 
4 
3) 
t=.6 1.73 1.816 1.822106 6 1.77 1.8204 1.8221180 
7 
8 
9 
0 


t=A 1.44 1.488 1.491818 


1.95 2.0116 2.0137516 
2.14 2.2228  2.2255396 
2.36 2.4562 2.4596014 
2.59 2.7141 2.7182797 


t=.8 2.007 2.215 2.225521 
t=1 2.49 2.703 2.718251 dis 


The error in y* is divided by 4 (from .015 to .004 at t = 1) when At is cut in half. 
This indicates second order accuracy for simplified Runge-Kutta, as the theory predicted. 
The work is only doubled. 
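The numbers in Example 1 can be regenerated with a few lines. This is a sketch for the single equation $y' = y$ only; the function names are illustrative, and full Runge-Kutta is written in the common form with weights 1/6, which gives the same step as the $k$'s that carry a factor 1/2.

```python
from math import e

def euler(y, dt):
    # f(t, y) = y for the model problem
    return y + dt * y

def simplified_rk(y, dt):
    # Average the slope at y_n with the slope at the Euler prediction.
    return y + 0.5 * dt * (y + euler(y, dt))

def runge_kutta(y, dt):
    # Four nested slope evaluations, each feeding the next.
    k1 = y
    k2 = y + 0.5 * dt * k1
    k3 = y + 0.5 * dt * k2
    k4 = y + dt * k3
    return y + (dt / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

def value_at_one(step, dt):
    """March from y(0) = 1 to t = 1 with equal steps dt."""
    y = 1.0
    for _ in range(round(1 / dt)):
        y = step(y, dt)
    return y

for dt in (0.2, 0.1):
    print(dt, [round(value_at_one(s, dt), 7) for s in (euler, simplified_rk, runge_kutta)])

err_02 = e - value_at_one(simplified_rk, 0.2)
err_01 = e - value_at_one(simplified_rk, 0.1)
print(err_02 / err_01)   # ratio of errors when the step is halved
```

The error ratio for simplified Runge-Kutta comes out a little below 4, as second order accuracy predicts.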


ode45 and ODEPACK and More


Runge-Kutta is accurate and easy to code. The final value $y_{n+1}$ can be made even better. With six evaluations of $f$ (not four) we can also compute a value $Y_{n+1}$ that has fifth order accuracy. By comparing with $y_{n+1}$ we get an estimate of the error, which indicates whether a larger $\Delta t$ is possible or a smaller $\Delta t$ is necessary. This is the heart of MATLAB’s ode45 code. A good solver for stiff systems is ode15s.

ODEPACK and SUNDIALS are open collections of codes from Livermore Laboratory (ODEPACK is written in Fortran 77). Those emphasize Adams methods, with backward differences for stiff problems.

Mathematica has DSolve for solution formulas and NDSolve for numerical solutions. Wolfram Alpha is remarkable for the very wide range of problems it solves. SciPy and SymPy and Scilab are also free and high quality. See the web!


REVIEW OF THE KEY IDEAS


1. Higher order equations like $y'' + y' + y = F(t, y, y')$ reduce to $\boldsymbol{y}' = \boldsymbol{f}(t, \boldsymbol{y})$. Most finite difference methods prefer this first order system with $\boldsymbol{y} = (y, y')$.

2. $y^E_{n+1} = y_n + \Delta t\, f_n$ improves to second order accuracy by also using $f(t_{n+1}, y^E_{n+1})$.

3. Fourth order Runge-Kutta uses that substitution into $f(t, y)$ four times in each step.

4. The Runge-Kutta error is divided by almost $2^4 = 16$ when $\Delta t$ is divided by 2.

5. Stability for $y' = ay$ requires $a\,\Delta t^{E} > -2$ and $a\,\Delta t^{S} > -2$ and $a\,\Delta t^{RK} > -2.78$. Otherwise disaster for $a < 0$: the approximations $y_n$ will start to grow.


Problem Set 3.5 


Runge-Kutta can only be appreciated by using it. A simple code is on math.mit.edu/dela.
Professional codes are ode45 (in MATLAB) and ODEPACK and many more.


1 For $y' = y$ with $y(0) = 1$, show that simplified Runge-Kutta and full Runge-Kutta give these approximations $y_1$ to the exact $y(\Delta t) = e^{\Delta t}$:

$$y_1^S = 1 + \Delta t + \tfrac12(\Delta t)^2 \qquad y_1^{RK} = 1 + \Delta t + \tfrac12(\Delta t)^2 + \tfrac16(\Delta t)^3 + \tfrac1{24}(\Delta t)^4$$


2 With $\Delta t = 0.1$ compute those numbers $y_1^S$ and $y_1^{RK}$ and subtract from the exact $y = e^{\Delta t}$. The errors should be close to $(\Delta t)^3/6$ and $(\Delta t)^5/120$.


3 Those values $y_1^S$ and $y_1^{RK}$ have errors of order $(\Delta t)^3$ and $(\Delta t)^5$. Errors of this size at every time step will produce total errors of size ____ and ____ at time $T$, from $N$ steps of size $\Delta t = T/N$.


Those estimates of total error are correct provided errors don’t grow (stability). 


4 $dy/dt = f(t)$ with $y(0) = 0$ is solved by integration when $f$ does not involve $y$. From time $t = 0$ to $\Delta t$, simplified Runge-Kutta approximates the integral of $f(t)$:

$$y_1^S = \Delta t\left(\tfrac12 f(0) + \tfrac12 f(\Delta t)\right) \quad\text{is close to}\quad y(\Delta t) = \int_0^{\Delta t} f(t)\,dt$$

[Figure: the region under the straight-line graph of $f(t)$ from $t = 0$ to $t = \Delta t$, with heights $f(0)$ and $f(\Delta t)$]

Suppose the graph of $f(t)$ is a straight line as shown. Then the region is a trapezoid. Check that its area is exactly $y_1^S$. Second order means exact for linear $f$.


5 Suppose again that $f$ does not involve $y$, so $dy/dt = f(t)$ with $y(0) = 0$. Then full Runge-Kutta from $t = 0$ to $\Delta t$ approximates the integral of $f(t)$ by $y_1^{RK}$:

$$y_1^{RK} = \Delta t\,\big(c_1 f(0) + c_2 f(\Delta t/2) + c_3 f(\Delta t)\big). \quad \text{Find } c_1, c_2, c_3.$$

This approximation to $\int_0^{\Delta t} f(t)\,dt$ is called Simpson’s Rule. It has 4th order accuracy.


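6 Reduce these second order equations to first order systems $\boldsymbol{y}' = \boldsymbol{f}(t, \boldsymbol{y})$ for the vector $\boldsymbol{y} = (y, y')$. Write the two components of $\boldsymbol{y}_1^E$ (Euler) and $\boldsymbol{y}_1^S$.


(a) $y'' + yy' + y^4 = 1$ $\qquad$ (b) $my'' + by' + ky = \cos t$


7 When $my'' + by' + ky = \cos t$ in Problem 6 is reduced to a vector equation $\boldsymbol{y}' = A\boldsymbol{y} + \boldsymbol{f}$, find $\boldsymbol{y}_1^E$ and $\boldsymbol{y}_1^S$ from the initial vector $\boldsymbol{y}_0$.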


8 For $y' = -y$ and $y_0 = 1$ the exact solution $y = e^{-t}$ is approximated at time $\Delta t$ by 2 or 3 or 5 terms:

$$y_1^E = 1 - \Delta t \qquad y_1^S = 1 - \Delta t + \tfrac12(\Delta t)^2 \qquad y_1^{RK} = 1 - \Delta t + \tfrac12(\Delta t)^2 - \tfrac16(\Delta t)^3 + \tfrac1{24}(\Delta t)^4$$

(a) With $\Delta t = 1$ compare those three numbers to the exact $e^{-1}$. What error $E$?
(b) With $\Delta t = 1/2$ compare those three numbers to $e^{-1/2}$. Is the error near $E/16$?


9 For $y' = ay$, simplified Runge-Kutta gives $y^S_{n+1} = \left(1 + a\,\Delta t + \tfrac12(a\,\Delta t)^2\right) y_n$. This multiplier of $y_n$ reaches $1 - 2 + 2 = 1$ when $a\,\Delta t = -2$: the stability limit.


10 (Computer experiment) For $N = 1, 2, \ldots, 10$ discover the stability limit $L = L_N$ when the series for $e^{-L}$ is cut off after $N + 1$ terms:

$$\left|\,1 - L + \tfrac12 L^2 - \tfrac16 L^3 + \cdots \pm \tfrac{1}{N!} L^N\right| = 1$$

We know $L = 2$ for $N = 1$ and $N = 2$. Runge-Kutta has $L = 2.78$ for $N = 4$.
CHAPTER 3 NOTES


Proof that $y' = f(t, y)$ has a solution: Functions $y_0, y_1, y_2, \ldots$ approach $y(t)$


Section 3.1 stated a fact: $dy/dt = f(t, y)$ has one solution starting from $y(0)$, when $f$ is a good function: Assume $f$ and $\partial f/\partial y$ are continuous at all points. Since we have no formula for $y$ (and we don’t expect one), how can we know that a solution exists?

One good answer constructs $y_1$ from $y_0 = y(0)$, then $y_2$ from $y_1$, then $y_3$ from $y_2, \ldots$

$$\text{Equation}\quad y_{n+1}' = f(t, y_n(t)) \qquad \text{Solution}\quad y_{n+1}(t) = y_0 + \int_0^t f(s, y_n(s))\,ds \qquad (6)$$


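Let me practice with $y' = y$ and $y(0) = 1$. The solution is $e^t$. Take three steps to $y_3$:

$$y_0' = 0 \qquad y_1' = y_0 \qquad y_2' = y_1 \qquad y_3' = y_2$$

$$y_0 = 1 \qquad y_1 = 1 + t \qquad y_2 = 1 + t + \frac{t^2}{2} \qquad y_3 = 1 + t + \frac{t^2}{2} + \frac{t^3}{6}$$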
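Those three steps can be reproduced exactly. In the sketch below a polynomial is stored as its list of coefficients (a representation chosen here for illustration), and one step integrates $y_n$ from 0 to $t$ and adds $y_0 = 1$, following equation (6) with $f(t, y) = y$.

```python
from fractions import Fraction

def picard_step(coeffs):
    """y_{n+1}(t) = 1 + integral from 0 to t of y_n(s) ds, for y' = y and y(0) = 1.

    coeffs = [c0, c1, c2, ...] represents c0 + c1*t + c2*t^2 + ...
    """
    integral = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    integral[0] += 1          # add the starting value y_0 = y(0) = 1
    return integral

y = [Fraction(1)]             # y_0(t) = 1
for _ in range(3):
    y = picard_step(y)        # y_1, then y_2, then y_3
print(y)                      # coefficients of 1 + t + t^2/2 + t^3/6
```

Each further step appends the next term $t^n/n!$ of the series for $e^t$.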


The same construction of $e^t$ was in Section 1.3. Now we go much further, to solve nonlinear equations $y' = f(t, y)$. The key idea is to compare $y_{n+1} - y_n$ with the previous $y_n - y_{n-1}$. Subtract equation (6) for $y_n$ from equation (6) for $y_{n+1}$:

$$y_{n+1}(t) - y_n(t) = \int_0^t \big[f(s, y_n(s)) - f(s, y_{n-1}(s))\big]\,ds. \qquad (7)$$

When $|\partial f/\partial y| \le L$, the difference $|f(y_n) - f(y_{n-1})|$ is not larger than $L\,|y_n - y_{n-1}|$.

$$|y_2 - y_1| \le \int_0^t L\,|y_1 - y_0|\,ds \le L\,t\,|y_1 - y_0|_{\max}$$

$$|y_3 - y_2| \le \int_0^t L\,|y_2 - y_1|\,ds \le \int_0^t L^2 s\,|y_1 - y_0|_{\max}\,ds = \frac{L^2 t^2}{2}\,|y_1 - y_0|_{\max}$$

We are seeing $Lt$ and $L^2t^2/2$ and next will be $L^3t^3/6$. Those numbers $L^n t^n/n!$ approach zero quickly because of $n!$. If $n$ is large and $N$ is larger, then

$$|y_N - y_n| \le |y_N - y_{N-1}| + |y_{N-1} - y_{N-2}| + \cdots + |y_{n+1} - y_n| \le C\,\frac{L^n t^n}{n!}$$


This is what we need to know: the differences $y_N(t) - y_n(t)$ approach zero. Cauchy showed that the numbers $y_n(t)$ must approach a limit $y(t)$. (Of course $y_{n+1}$ will approach the same limit.) That limiting function $y(t)$ will be our desired solution:

$$y_{n+1} = y_0 + \int_0^t f(s, y_n(s))\,ds \;\longrightarrow\; y(t) = y_0 + \int_0^t f(s, y(s))\,ds. \quad \text{Then } y' = f(t, y).$$


Chapter 4 


Linear Equations and Inverse 
Matrices 


4.1. Two Pictures of Linear Equations 


The central problem of linear algebra is to solve a system of equations. Those equations are linear, which means that the unknowns are only multiplied by numbers: we never see $x^2$ or $x$ times $y$. Our first linear system is deceptively small, only “2 by 2.” But you will see how far it leads:

$$\begin{array}{lrcl} \text{Two equations} & x - 2y &=& 1 \\ \text{Two unknowns} & 2x + y &=& 7 \end{array} \qquad (1)$$


We begin a row at a time. The first equation $x - 2y = 1$ produces a straight line in the $xy$ plane. The point $x = 1, y = 0$ is on the line because it solves that equation. The point $x = 3, y = 1$ is also on the line because $3 - 2 = 1$. For $x = 101$ we find $y = 50$.

The slope of this line in Figure 4.1 is $\frac12$, because $y$ increases by 1 when $x$ changes by 2. But slopes are important in calculus and this is linear algebra!


Figure 4.1: Row picture: The point (3, 1) where the two lines meet is the solution. 
The second line in this “row picture” comes from the second equation $2x + y = 7$. You can’t miss the intersection point where the two lines meet. The point $x = 3, y = 1$ lies on both lines. It solves both equations at once. This is the solution to our two equations.


ROWS The row picture shows two lines meeting at a single point (the solution). 


Turn now to the column picture. I want to recognize the same linear system as a 
“vector equation.” Instead of numbers we need to see vectors. If you separate the original 
system into its columns instead of its rows, you get a vector equation: 


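$$\text{Combination equals } b \qquad x\begin{bmatrix}1\\2\end{bmatrix} + y\begin{bmatrix}-2\\1\end{bmatrix} = \begin{bmatrix}1\\7\end{bmatrix} = b \qquad (2)$$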


This has two column vectors on the left side. The problem is to find the combination of 
those vectors that equals the vector on the right. We are multiplying the first column by 
x and the second column by y, and adding vectors. With the right choices x = 3 and 
y = 1 (the same numbers as before), this produces 3(column 1) + 1(column 2) = b. 


COLUMNS The column picture combines the column vectors on the left side
of the equations to produce the vector b on the right side.




Figure 4.2: Column picture : A combination 3 (column 1) + 1 (column 2) gives the vector b. 


Figure 4.2 is the “column picture” of two equations in two unknowns. The left side 
shows the two separate columns, and column 1 is multiplied by 3. This multiplication by a 
scalar (a number) is one of the two basic operations in linear algebra: 


$$\text{Scalar multiplication} \qquad 3\begin{bmatrix}1\\2\end{bmatrix} = \begin{bmatrix}3\\6\end{bmatrix}$$

If the components of a vector $v$ are $v_1$ and $v_2$, then $cv$ has components $cv_1$ and $cv_2$. The other basic operation is vector addition. We add the first components and the second components separately. $3 - 2$ and $6 + 1$ give the vector sum $(1, 7)$ as desired:

$$\text{Vector addition} \qquad \begin{bmatrix}3\\6\end{bmatrix} + \begin{bmatrix}-2\\1\end{bmatrix} = \begin{bmatrix}1\\7\end{bmatrix}$$


The right side of Figure 4.2 shows this addition. The sum along the diagonal is the vector 
b = (1, 7) on the right side of the linear equations. 

To repeat: The left side of the vector equation is a linear combination of the columns. The problem is to find the right coefficients $x = 3$ and $y = 1$. We are combining scalar multiplication and vector addition into one step. That combination step is crucially important, because it contains both of the basic operations on vectors: multiply and add.

$$\text{Linear combination of the 2 columns} \qquad 3\begin{bmatrix}1\\2\end{bmatrix} + 1\begin{bmatrix}-2\\1\end{bmatrix} = \begin{bmatrix}1\\7\end{bmatrix}$$


Of course the solution x = 3,y = 1 is the same as in the row picture. I don’t know 
which picture you prefer! Two intersecting lines are more familiar at first. You may like 
the row picture better, but only for a day. My own preference is to combine column vectors. 
It is a lot easier to see a combination of four vectors in four-dimensional space, than to 
visualize how four “planes” might possibly meet at a point. (Even one three-dimensional 
plane in four-dimensional space is hard enough. . .) 

The coefficient matrix on the left side of equation (1) is the 2 by 2 matrix A: 


$$\text{Coefficient matrix} \qquad A = \begin{bmatrix}1 & -2\\ 2 & 1\end{bmatrix}$$

This is very typical of linear algebra, to look at a matrix by rows and also by columns. 
Its rows give the row picture and its columns give the column picture. Same numbers, 
different pictures, same equations. We write those equations as a matrix problem Av = b: 


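$$\text{Matrix multiplies vector} \qquad Av = b \quad\text{is}\quad \begin{bmatrix}1 & -2\\ 2 & 1\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}1\\7\end{bmatrix}$$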


The row picture deals with the two rows of A. The column picture combines the columns. 
The numbers x = 3 and y = 1 go into the solution vector v. Here is matrix-vector multipli- 
cation, matrix A times vector v. Please look at this multiplication Av ! 


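Dot products with rows, combination of columns:

$$Av = \begin{bmatrix}1&-2\\2&1\end{bmatrix}\begin{bmatrix}3\\1\end{bmatrix} = \begin{bmatrix}(1,-2)\cdot(3,1)\\(2,1)\cdot(3,1)\end{bmatrix} = 3\begin{bmatrix}1\\2\end{bmatrix} + 1\begin{bmatrix}-2\\1\end{bmatrix} \qquad (3)$$


Linear Combinations of Vectors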


Before I go to three dimensions, let me show you the most important operation on vectors. 
We can see a vector like v = (3,1) as a pair of numbers, or as a point in the plane, or 
as an arrow that starts from (0,0). The arrow ends at the point (3, 1) in Figure 4.3. 


[Figure panels: column vector $v = \begin{bmatrix}3\\1\end{bmatrix}$, point (3, 1), arrow from (0, 0) to (3, 1)]


Figure 4.3: The vector v is given by two numbers or a point or an arrow from (0, 0). 


A first step is to multiply that vector by any number c. If c = 2 then the vector is doubled 
to 2v. If c = —1 then it changes direction to —v. Always the “scalar” c multiplies each 
separate component (here 3 and 1) of the vector v. The arrow doubles the length to show 2v 
and it reverses direction to show —v: 


column vectors arrows to (6, 2) and (—3, —1) 


Figure 4.4: Multiply the vector v = (3, 1) by scalars c = 2 and —1 to get cv = (3c, c). 


If we have another vector w = (—1,1), we can add it to v. Vector addition v + w 
can use numbers (the normal way) or it can use the arrows (to visualize v + w). The 
arrows in Figure 4.5 go head to tail: At the end of v, place the start of w. 


Figure 4.5: The sum of v = (3, 1) and w = (−1, 1) is v + w = (2, 2). This is also w + v.


Allow me to say, adding v + w and multiplying cv will soon be second nature. In 
themselves they are not impressive. What really counts is when you do both at once. 
Multiply $cv$ and also $dw$, then add to get the linear combination $cv + dw$.

$$\text{Linear combination } 2v + 3w \qquad 2\begin{bmatrix}3\\1\end{bmatrix} + 3\begin{bmatrix}-1\\1\end{bmatrix} = \begin{bmatrix}3\\5\end{bmatrix}$$

This is the basic operation of linear algebra! If you have two 5-dimensional vectors like $v = (1,1,1,1,2)$ and $w = (3,0,0,1,0)$, you can multiply $v$ by 2 and $w$ by 1. You can combine to get $2v + w = (5,2,2,3,4)$. Every combination $cv + dw$ is a vector in the big 5-dimensional space $\mathbf{R}^5$.

I admit that there is no picture to show these vectors in $\mathbf{R}^5$. Somehow I imagine arrows going to $v$ and $w$. If you think of all the vectors $cv$, they form a line in $\mathbf{R}^5$. The line goes in both directions from $(0,0,0,0,0)$ because $c$ can be positive or negative or zero.

Similarly there is a line of all vectors dw. The hard but all-important part is to imagine 
all the combinations cv + dw. Add all vectors on one line to all vectors on the other line, 
and what do you get? It is a “2-dimensional plane” inside the big 5-dimensional space. 
I don’t lose sleep trying to visualize that plane. (There is no problem in working with the 
five numbers.) For linear combinations in high dimensions, algebra wins. 
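Working with the numbers is easy in any dimension. A minimal sketch, with vectors stored as plain lists and a helper name `combine` chosen only for illustration:

```python
def combine(c, v, d, w):
    """Return the linear combination c*v + d*w, one component at a time."""
    return [c * vi + d * wi for vi, wi in zip(v, w)]

plane = combine(2, [3, 1], 3, [-1, 1])                  # 2v + 3w in the plane
big = combine(2, [1, 1, 1, 1, 2], 1, [3, 0, 0, 1, 0])   # 2v + w in R^5
print(plane, big)
```

The same function handles two components or five, which is the sense in which algebra wins.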


Dot Product of v and w 


The other important operation on vectors is a kind of multiplication. This is not ordinary multiplication and we don’t write $vw$. The output from $v$ and $w$ will be one number and it is called the dot product $v \cdot w$.


DEFINITION The dot product of $v = (v_1, v_2)$ and $w = (w_1, w_2)$ is the number $v \cdot w$:

$$v \cdot w = v_1 w_1 + v_2 w_2. \qquad (4)$$


The dot product of $v = (3,1)$ and $w = (-1,1)$ is $v \cdot w = (3)(-1) + (1)(1) = -2$.

Example 1 The column vectors $(1,2)$ and $(-2,1)$ have a zero dot product:

$$\begin{array}{l}\text{Dot product is zero}\\ \text{Perpendicular vectors}\end{array} \qquad \begin{bmatrix}1\\2\end{bmatrix}\cdot\begin{bmatrix}-2\\1\end{bmatrix} = -2 + 2 = 0$$


In mathematics, zero is always a special number. For dot products, it means that these two 
vectors are perpendicular. The angle between them is 90°. 

The clearest example of two perpendicular vectors is $i = (1,0)$ along the $x$ axis and $j = (0,1)$ up the $y$ axis. Again the dot product is $i \cdot j = 0 + 0 = 0$. Those vectors $i$ and $j$ form a right angle. They are the columns of the 2 by 2 identity matrix $I$.

The dot product of v = (3,1) and w = (1,2) is 5. Soon v - w will reveal the angle 
between v and w (not 90°). Please check that w - v is also 5. 
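The three dot products in this section can be checked with a short sketch (the function name `dot` is illustrative):

```python
def dot(v, w):
    """Dot product: multiply matching components and add."""
    return sum(vi * wi for vi, wi in zip(v, w))

first = dot([3, 1], [-1, 1])    # (3)(-1) + (1)(1)
perp = dot([1, 2], [-2, 1])     # perpendicular vectors
third = dot([3, 1], [1, 2])
same = dot([1, 2], [3, 1])      # w . v in the other order
print(first, perp, third, same)
```

The order of the two vectors does not matter, because each product $v_i w_i$ equals $w_i v_i$.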
Multiplying a Matrix A and a Vector v


Linear equations have the form Av = b. The right side b is a column vector. On the left side, 
the coefficient matrix A multiplies the unknown column vector v (we don’t use a “dot” for 
Av). The all-important fact is that Av is computed by dot products in the 
row picture, and Av is a combination of the columns in the column picture. 

I put those words “combination of the columns” in boldface, because this is an essential 
idea that is sometimes missed. One definition is usually enough in linear algebra, but Av has 
two definitions—the rows and the columns produce the same output vector Av. 

The rules stay the same if $A$ has $n$ columns $a_1, \ldots, a_n$. Then $v$ has $n$ components. The vector $Av$ is still a combination of the columns, $Av = v_1 a_1 + v_2 a_2 + \cdots + v_n a_n$. The numbers in $v$ multiply the columns in $A$. Let me start with $n = 2$.

$$\text{By rows}\quad Av = \begin{bmatrix}(\text{row 1})\cdot v\\(\text{row 2})\cdot v\end{bmatrix} \qquad \text{By columns}\quad Av = v_1(\text{column 1}) + v_2(\text{column 2}).$$


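Example 2 In equation (3) I wrote “dot products with rows” and “combination of columns.” Now you know what those mean. They are the two ways to look at $Av$:

$$\begin{array}{l}\text{Dot products with rows}\\ \text{Combination of columns}\end{array} \qquad \begin{bmatrix}a&b\\c&d\end{bmatrix}\begin{bmatrix}v_1\\v_2\end{bmatrix} = \begin{bmatrix}av_1 + bv_2\\ cv_1 + dv_2\end{bmatrix} = v_1\begin{bmatrix}a\\c\end{bmatrix} + v_2\begin{bmatrix}b\\d\end{bmatrix}. \qquad (5)$$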


You might naturally ask, which way to find Av ? My own answer is this: I compute by 
rows and I visualize (and understand) by columns. Combinations of columns are truly funda- 
mental. But to calculate the answer Av, I have to find one component at a time. 
Those components of Av are the dot products with the rows of A. 


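$$\begin{bmatrix}2&3\\6&8\end{bmatrix}\begin{bmatrix}v_1\\v_2\end{bmatrix} = \begin{bmatrix}2v_1 + 3v_2\\ 6v_1 + 8v_2\end{bmatrix} = v_1\begin{bmatrix}2\\6\end{bmatrix} + v_2\begin{bmatrix}3\\8\end{bmatrix}$$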
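Both definitions of $Av$ are easy to code, and comparing them is a good check. This sketch stores a matrix as a list of rows (an assumed representation) and uses the matrix and solution from equation (1).

```python
def matvec_by_rows(A, v):
    """Each component of Av is the dot product of one row of A with v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def matvec_by_columns(A, v):
    """Av is a combination of the columns: v_1 (column 1) + v_2 (column 2) + ..."""
    result = [0] * len(A)
    for j, vj in enumerate(v):
        for i in range(len(A)):
            result[i] += vj * A[i][j]   # add v_j times column j
    return result

A = [[1, -2], [2, 1]]
v = [3, 1]
by_rows = matvec_by_rows(A, v)
by_columns = matvec_by_columns(A, v)
print(by_rows, by_columns)
```

Both calls produce the right side $b = (1, 7)$ of the original system, by different routes through the same multiplications.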


Singular Matrices and Parallel Lines 


The row picture and column picture can fail, and they will fail together. For a 2 by 2 matrix, the row picture fails when the lines from row 1 and row 2 are parallel. The lines don’t meet and $Av = b$ has no solution:

$$A = \begin{bmatrix}2&-3\\4&-6\end{bmatrix} \qquad \begin{array}{rcl}2v_1 - 3v_2 &=& 6\\ 4v_1 - 6v_2 &=& 0\end{array} \qquad \begin{array}{l}\text{Parallel lines}\\ \text{no solution}\end{array}$$

The row picture shows the problem and so does the algebra: 2 times equation 1 produces $4v_1 - 6v_2 = 12$. But equation 2 requires $4v_1 - 6v_2 = 0$. Notice that this line goes through the center point (0, 0) because the right side is zero.

How does the column picture fail? Columns 1 and 2 point along the same line. When the rows are “dependent”, the columns are also dependent. All combinations of the columns $(2, 4)$ and $(-3, -6)$ lie on that one line. Since the right side $b = (6, 0)$ is not on that line, $b$ is not a combination of those two column vectors of $A$. Figure 4.6 (a) shows that there is no solution to the equation.


Figure 4.6: Column pictures (a) No solution: b not on the line of columns (b) Infinity of solutions: b is on the line of columns


Example 3 Same matrix $A$, now $b = (6, 12)$, infinitely many solutions to $Av = b$

$$Av = \begin{bmatrix}2&-3\\4&-6\end{bmatrix}\begin{bmatrix}v_1\\v_2\end{bmatrix} = \begin{bmatrix}6\\12\end{bmatrix} \qquad \begin{array}{rcl}2v_1 - 3v_2 &=& 6\\ 4v_1 - 6v_2 &=& 12\end{array}$$

In the row picture, the two lines are the same. All points on that line solve both equations. Two times equation 1 gives equation 2. Those close lines are one line.

In the column picture above, the right side $b = (6,12)$ falls right onto the line of the columns. Later we will say: $b$ is in the column space of $A$. There are infinitely many ways to produce $(6, 12)$ as a combination of the columns. They come from infinitely many ways to produce $b = (0,0)$ (choose any $c$). Add one way to produce $b = (6, 12) = 3(2, 4)$.

$$\begin{bmatrix}0\\0\end{bmatrix} = 3c\begin{bmatrix}2\\4\end{bmatrix} + 2c\begin{bmatrix}-3\\-6\end{bmatrix} \qquad \begin{bmatrix}6\\12\end{bmatrix} = 3\begin{bmatrix}2\\4\end{bmatrix} + 0\begin{bmatrix}-3\\-6\end{bmatrix} \qquad (6)$$

The vector $v_n = (3c, 2c)$ is a null solution and $v_p = (3,0)$ is a particular solution. $Av_n$ equals zero and $Av_p$ equals $b$. Then $A(v_p + v_n) = b$. Together, $v_p$ and $v_n$ give the complete solution, all the ways to produce $b = (6,12)$ from the columns of $A$:

$$\text{Complete solution to } Av = b \qquad v_{\text{complete}} = v_p + v_n = \begin{bmatrix}3\\0\end{bmatrix} + \begin{bmatrix}3c\\2c\end{bmatrix} \qquad (7)$$
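The complete solution can be tested for several values of $c$. This sketch takes the singular matrix with rows $(2, -3)$ and $(4, -6)$, as in the equations $2v_1 - 3v_2 = 6$ and $4v_1 - 6v_2 = 12$; the helper names are illustrative.

```python
def matvec(A, v):
    """Multiply matrix A (a list of rows) by vector v using dot products."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[2, -3], [4, -6]]
b = [6, 12]

v_p = [3, 0]                      # one particular solution: A v_p = b

def v_n(c):
    return [3 * c, 2 * c]         # null solutions: A v_n = 0 for every c

# v_p + v_n for a few sample values of c
solutions = [[p + n for p, n in zip(v_p, v_n(c))] for c in (-2, 0, 1, 5)]
print([matvec(A, s) for s in solutions])
```

Every vector in the list is sent to the same $b = (6, 12)$, because the null part contributes zero.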
Equations and Pictures in Three Dimensions


In three dimensions, a linear equation like $x + y + 2z = 6$ produces a plane. The plane would go through $(0,0,0)$ if the right side were 0. In this case the “6” moves us to a parallel plane that misses the center point $(0,0,0)$.

A second linear equation will produce another plane. Normally the two planes meet in a line. Then a third plane (from a third equation) normally cuts through that line at a point. That point will lie on all three planes, so it solves all three equations.

This is the row picture, three planes in three-dimensional space. They meet at the solution. One big problem is that this row picture is hard to draw. Three planes are too many to see clearly how they meet (maybe Picasso could do it).

The column picture of $Av = b$ is easier. It starts with three column vectors in three-dimensional space. We want to combine those columns of $A$ to produce the vector $v_1(\text{column 1}) + v_2(\text{column 2}) + v_3(\text{column 3}) = b$. Normally there is one way to do it. That gives the solution $(v_1, v_2, v_3)$, which is also the meeting point in the row picture.

I want to give an example of success (one solution) and an example of failure (no solu- 
tion). Both examples are simple, but they really go deeply into linear algebra. 


Example 4 Invertible matrix $A$, one solution $v$ for any right side $b$.

$$Av = b \quad\text{is}\quad \begin{bmatrix}1&0&0\\-1&1&0\\0&-1&1\end{bmatrix}\begin{bmatrix}v_1\\v_2\\v_3\end{bmatrix} = \begin{bmatrix}1\\3\\5\end{bmatrix} \qquad (8)$$

This matrix is lower triangular. It has zeros above the main diagonal. Lower triangular systems are quickly solved by forward substitution, top to bottom. The top equation gives $v_1$, then move down. First $v_1 = 1$. Then $-v_1 + v_2 = 3$ gives $v_2 = 4$. Then $-v_2 + v_3 = 5$ gives $v_3 = 9$.

Figure 4.7 shows the three columns $a_1, a_2, a_3$. When you combine them with 1, 4, 9 you produce $b = (1,3,5)$. In reverse, $v = (1,4,9)$ must be the solution to $Av = b$.
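Forward substitution takes only a few lines. The sketch below assumes a lower triangular matrix with nonzero diagonal entries, stored as a list of rows; the function name is illustrative.

```python
def forward_substitution(L, b):
    """Solve L v = b from the top equation down (L lower triangular, nonzero diagonal)."""
    v = []
    for i, row in enumerate(L):
        known = sum(row[j] * v[j] for j in range(i))   # terms already solved
        v.append((b[i] - known) / row[i])
    return v

A = [[1, 0, 0], [-1, 1, 0], [0, -1, 1]]
v = forward_substitution(A, [1, 3, 5])
print(v)
```

Each equation introduces exactly one new unknown, so no equation has to be revisited.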


Figure 4.7: Independent columns $a_1, a_2, a_3$ not in a plane. Dependent columns $c_1, c_2, c_3$ are three vectors all in the same plane.
Example 5 Singular matrix: no solution to $Cw = b$ or infinitely many solutions (depending on $b$).

$$\begin{array}{rcl} w_1 - w_3 &=& b_1\\ -w_1 + w_2 &=& b_2\\ -w_2 + w_3 &=& b_3\end{array} \qquad \begin{bmatrix}1&0&-1\\-1&1&0\\0&-1&1\end{bmatrix}\begin{bmatrix}w_1\\w_2\\w_3\end{bmatrix} = \begin{bmatrix}1\\3\\5\end{bmatrix} \text{ or } \begin{bmatrix}0\\0\\0\end{bmatrix} \text{ or } \begin{bmatrix}1\\2\\-3\end{bmatrix}. \qquad (9)$$

This matrix $C$ is a “circulant.” The diagonals are constants, all 1’s or all 0’s or all −1’s. The diagonals circle around so each diagonal has three equal entries. Circulant matrices will be perfect for the Fast Fourier Transform (FFT) in Chapter 8.

To see if $Cw = b$ has a solution, add those three equations to get $0 = b_1 + b_2 + b_3$.

$$\text{Left side} \qquad (w_1 - w_3) + (-w_1 + w_2) + (-w_2 + w_3) = 0. \qquad (10)$$

$Cw = b$ cannot have a solution unless $0 = b_1 + b_2 + b_3$. The components of $b = (1,3,5)$ do not add to zero, so $Cw = (1,3,5)$ has no solution.

Figure 4.7 shows the problem. The three columns of $C$ lie in a plane. All combinations $Cw$ of those columns will lie in that same plane. If the right side vector $b$ is not in the plane, then $Cw = b$ cannot be solved. The vector $b = (1,3,5)$ is off the plane, because the equation of the plane requires $b_1 + b_2 + b_3 = 0$.

Of course $Cw = (0,0,0)$ always has the zero solution $w = (0,0,0)$. But when the columns of $C$ are in a plane (as here), there are additional nonzero solutions to $Cw = 0$. Those three equations are $w_1 = w_3$ and $w_1 = w_2$ and $w_2 = w_3$. The null solutions are $w_n = (c, c, c)$. When all three components are equal, we have $Cw_n = 0$.

The vector $b = (1,2,-3)$ is also in the plane of the columns, because it does have $b_1 + b_2 + b_3 = 0$. In this good case there must be a particular solution to $Cw_p = b$. There are many particular solutions $w_p$, since any solution can be a particular solution. I will choose the particular $w_p = (1, 3, 0)$ that ends in $w_3 = 0$:

$$Cw_p = \begin{bmatrix}1&0&-1\\-1&1&0\\0&-1&1\end{bmatrix}\begin{bmatrix}1\\3\\0\end{bmatrix} = \begin{bmatrix}1\\2\\-3\end{bmatrix} \qquad \text{The complete solution is } w_{\text{complete}} = w_p + \text{any } w_n$$
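The test $b_1 + b_2 + b_3 = 0$ and the complete solution can both be checked directly. A minimal sketch, with illustrative helper names and a few sample values of $c$:

```python
def matvec(C, w):
    """Multiply matrix C (a list of rows) by vector w using dot products."""
    return [sum(a * x for a, x in zip(row, w)) for row in C]

C = [[1, 0, -1], [-1, 1, 0], [0, -1, 1]]

def passes_sum_test(b):
    """Adding the three equations of Cw = b gives 0 = b1 + b2 + b3."""
    return sum(b) == 0

w_p = [1, 3, 0]                                      # particular solution for b = (1, 2, -3)
results = [matvec(C, [w_p[0] + c, w_p[1] + c, w_p[2] + c]) for c in (-1, 0, 2)]
print(passes_sum_test([1, 3, 5]), passes_sum_test([1, 2, -3]), results)
```

Adding any $(c, c, c)$ to $w_p$ leaves $Cw$ unchanged, which is the meaning of $w_p$ plus any $w_n$.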


Summary These two matrices A and C, with third columns a3 and c3, allow me to 
mention two key words of linear algebra: independence and dependence. This book will 
develop those ideas much further. I am happy if you see them early in the two examples: 


$a_1, a_2, a_3$ are independent $\qquad$ $A$ is invertible $\qquad$ $Av = b$ has one solution $v$

$c_1, c_2, c_3$ are dependent $\qquad$ $C$ is singular $\qquad$ $Cw = 0$ has many solutions $w_n$


Eventually we will have n column vectors in n-dimensional space. The matrix will be 
n by n. The key question is whether Av = 0 has only the zero solution. Then the columns 
don’t lie in any “hyperplane.” When columns are independent, the matrix is invertible. 
Problem Set 4.1


Problems 1-8 are about the row and column pictures of Av = b. 


1 With $A = I$ (the identity matrix) draw the planes in the row picture. Three sides of a box meet at the solution $v = (x, y, z) = (2, 3, 4)$:

$$\begin{array}{rcl}1x + 0y + 0z &=& 2\\ 0x + 1y + 0z &=& 3\\ 0x + 0y + 1z &=& 4\end{array} \quad\text{or}\quad \begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}\begin{bmatrix}x\\y\\z\end{bmatrix} = \begin{bmatrix}2\\3\\4\end{bmatrix}$$

Draw the four vectors in the column picture. Two times column 1 plus three times column 2 plus four times column 3 equals the right side $b$.


2 If the equations in Problem 1 are multiplied by 2, 3, 4 they become $DV = B$:

$$\begin{array}{rcl}2x + 0y + 0z &=& 4\\ 0x + 3y + 0z &=& 9\\ 0x + 0y + 4z &=& 16\end{array} \quad\text{or}\quad DV = \begin{bmatrix}2&0&0\\0&3&0\\0&0&4\end{bmatrix}\begin{bmatrix}x\\y\\z\end{bmatrix} = \begin{bmatrix}4\\9\\16\end{bmatrix} = B$$

Why is the row picture the same? Is the solution $V$ the same as $v$? What is changed in the column picture: the columns or the right combination to give $B$?


3 If equation 1 is added to equation 2, which of these are changed: the planes in the row picture, the vectors in the column picture, the coefficient matrix, the solution? The new equations in Problem 1 would be $x = 2$, $x + y = 5$, $z = 4$.


4 Find a point with $z = 2$ on the intersection line of the planes $x + y + 3z = 6$ and $x - y + z = 4$. Find the point with $z = 0$. Find a third point halfway between.


5 The first of these equations plus the second equals the third:

$$\begin{array}{rcl}x + y + z &=& 2\\ x + 2y + z &=& 3\\ 2x + 3y + 2z &=& 5.\end{array}$$

The first two planes meet along a line. The third plane contains that line, because if $x, y, z$ satisfy the first two equations then they also ____. The equations have infinitely many solutions (the whole line L). Find three solutions on L.


6 Move the third plane in Problem 5 to a parallel plane $2x + 3y + 2z = 9$. Now the three equations have no solution. Why not? The first two planes meet along the line L, but the third plane doesn’t ____ that line.


7 In Problem 5 the columns are $(1,1,2)$ and $(1,2,3)$ and $(1,1,2)$. This is a “singular case” because the third column is ____. Find two combinations of the columns that give $b = (2,3,5)$. This is only possible for $b = (4, 6, c)$ if $c =$ ____.
8 Normally 4 “planes” in 4-dimensional space meet at a ____. Normally 4 vectors in 4-dimensional space can combine to produce $b$. What combination of $(1,0,0,0)$, $(1,1,0,0)$, $(1,1,1,0)$, $(1,1,1,1)$ produces $b = (3,3,3,2)$?


Problems 9-14 are about multiplying matrices and vectors. 


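9 Compute each $Ax$ by dot products of the rows with the column vector:

$$\text{(a)}\ \begin{bmatrix}1&2&4\\-2&3&1\\-4&1&2\end{bmatrix}\begin{bmatrix}2\\2\\3\end{bmatrix} \qquad \text{(b)}\ \begin{bmatrix}2&1&0&0\\1&2&1&0\\0&1&2&1\\0&0&1&2\end{bmatrix}\begin{bmatrix}1\\1\\1\\2\end{bmatrix}$$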


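10 Compute each $Ax$ in Problem 9 as a combination of the columns:

$$\text{9(a) becomes}\quad Ax = 2\begin{bmatrix}1\\-2\\-4\end{bmatrix} + 2\begin{bmatrix}2\\3\\1\end{bmatrix} + 3\begin{bmatrix}4\\1\\2\end{bmatrix} = \begin{bmatrix}\ \ \\ \ \ \\ \ \ \end{bmatrix}$$

How many separate multiplications for $Ax$, when the matrix is “3 by 3”?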


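11 Find the two components of $Ax$ by rows or by columns:

$$\begin{bmatrix}2&3\\5&1\end{bmatrix}\begin{bmatrix}4\\2\end{bmatrix} \quad\text{and}\quad \begin{bmatrix}3&6\\6&12\end{bmatrix}\begin{bmatrix}2\\-1\end{bmatrix} \quad\text{and}\quad \begin{bmatrix}1&2&4\\2&0&1\end{bmatrix}\begin{bmatrix}3\\1\\1\end{bmatrix}$$


12 Multiply $A$ times $x$ to find three components of $Ax$:

$$\begin{bmatrix}0&0&1\\0&1&0\\1&0&0\end{bmatrix}\begin{bmatrix}x\\y\\z\end{bmatrix} \quad\text{and}\quad \begin{bmatrix}2&1&3\\1&2&3\\3&3&6\end{bmatrix}\begin{bmatrix}1\\1\\-1\end{bmatrix} \quad\text{and}\quad \begin{bmatrix}2&1\\1&2\\3&3\end{bmatrix}\begin{bmatrix}1\\1\end{bmatrix}$$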
13 (a) A matrix with $m$ rows and $n$ columns multiplies a vector with ____ components to produce a vector with ____ components.

(b) The planes from the $m$ equations $Ax = b$ are in ____-dimensional space. The combination of the columns of $A$ is in ____-dimensional space.


14 Write $2x + 3y + z + 5t = 8$ as a matrix $A$ (how many rows?) multiplying the column vector $x = (x, y, z, t)$ to produce $b$. The solutions fill a plane or “hyperplane” in 4-dimensional space. The plane is 3-dimensional with no 4D volume.


Problems 15-22 ask for matrices that act in special ways on vectors. 


15 (a) What is the 2 by 2 identity matrix? $I$ times $\begin{bmatrix}x\\y\end{bmatrix}$ equals $\begin{bmatrix}x\\y\end{bmatrix}$.

(b) What is the 2 by 2 exchange matrix? $P$ times $\begin{bmatrix}x\\y\end{bmatrix}$ equals $\begin{bmatrix}y\\x\end{bmatrix}$.


16 (a) What 2 by 2 matrix $R$ rotates every vector by 90°? $R$ times $\begin{bmatrix}x\\y\end{bmatrix}$ is $\begin{bmatrix}y\\-x\end{bmatrix}$.

(b) What 2 by 2 matrix $R^2$ rotates every vector by 180°?


17 Find the matrix $P$ that multiplies $(x, y, z)$ to give $(y, z, x)$. Find the matrix $Q$ that multiplies $(y, z, x)$ to bring back $(x, y, z)$.


18 What 2 by 2 matrix $E$ subtracts the first component from the second component? What 3 by 3 matrix does the same?

$$E\begin{bmatrix}3\\5\end{bmatrix} = \begin{bmatrix}3\\2\end{bmatrix} \quad\text{and}\quad E\begin{bmatrix}3\\5\\7\end{bmatrix} = \begin{bmatrix}3\\2\\7\end{bmatrix}$$


19 What 3 by 3 matrix $E$ multiplies $(x, y, z)$ to give $(x, y, z + x)$? What matrix $E^{-1}$ multiplies $(x, y, z)$ to give $(x, y, z - x)$? If you multiply $(3, 4, 5)$ by $E$ and then multiply by $E^{-1}$, the two results are (____) and (____).


20 What 2 by 2 matrix $P_1$ projects the vector $(x, y)$ onto the $x$ axis to produce $(x, 0)$? What matrix $P_2$ projects onto the $y$ axis to produce $(0, y)$? If you multiply $(5, 7)$ by $P_1$ and then multiply by $P_2$, you get (____) and (____).


21 What 2 by 2 matrix $R$ rotates every vector through 45°? The vector $(1, 0)$ goes to $(\sqrt2/2, \sqrt2/2)$. The vector $(0, 1)$ goes to $(-\sqrt2/2, \sqrt2/2)$. Those determine the matrix. Draw these particular vectors in the $xy$ plane and find $R$.


22 Write the dot product of $(1, 4, 5)$ and $(x, y, z)$ as a matrix multiplication $Av$. The matrix $A$ has one row. The solutions to $Av = 0$ lie on a ____ perpendicular to the vector ____. The columns of $A$ are only in ____-dimensional space.


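23 In MATLAB notation, write the commands that define this matrix $A$ and the column vectors $v$ and $b$. What command would test whether or not $Av = b$?

$$A = \begin{bmatrix}1&2\\3&4\end{bmatrix} \qquad v = \begin{bmatrix}5\\-2\end{bmatrix} \qquad b = \begin{bmatrix}1\\7\end{bmatrix}$$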


24 If you multiply the 4 by 4 all-ones matrix A = ones(4) and the column v = ones(4,1), what is A*v? (Computer not needed.) If you multiply B = eye(4) + ones(4) times w = zeros(4,1) + 2*ones(4,1), what is B*w?


Questions 25-27 review the row and column pictures in 2, 3, and 4 dimensions.

25 Draw the row and column pictures for the equations $x - 2y = 0$, $x + y = 6$.


26 For two linear equations in three unknowns $x, y, z$, the row picture will show (2 or 3) (lines or planes) in (2 or 3)-dimensional space. The column picture is in (2 or 3)-dimensional space. The solutions normally lie on a ____.


27 For four linear equations in two unknowns $x$ and $y$, the row picture shows four ____. The column picture is in ____-dimensional space. The equations have no solution unless the vector on the right side is a combination of ____.
Challenge Problems


28 Invent a 3 by 3 magic matrix $M_3$ with entries 1, 2, ..., 9. All rows and columns and diagonals add to 15. The first row could be 8, 3, 4. What is $M_3$ times $(1, 1, 1)$? What is $M_4$ times $(1, 1, 1, 1)$ if a 4 by 4 magic matrix has entries 1, ..., 16?


29 Suppose $u$ and $v$ are the first two columns of a 3 by 3 matrix $A$. Which third columns $w$ would make this matrix singular? Describe a typical column picture of $Av = b$ in that singular case, and a typical row picture (for a random $b$).


30 Multiplying by $A$ is a “linear transformation”. Those important words mean: If $w$ is a combination of $u$ and $v$, then $Aw$ is the same combination of $Au$ and $Av$.

It is this “linearity” $Aw = cAu + dAv$ that gives us the name linear algebra.

If $u = \begin{bmatrix}1\\0\end{bmatrix}$ and $v = \begin{bmatrix}0\\1\end{bmatrix}$ then $Au$ and $Av$ are the columns of $A$.

Combine $w = cu + dv$. If $w = \begin{bmatrix}5\\7\end{bmatrix}$ how is $Aw$ connected to $Au$ and $Av$?


31 A 9 by 9 Sudoku matrix $S$ has the numbers 1, ..., 9 in every row and column, and in every 3 by 3 block. For the all-ones vector $v = (1, \ldots, 1)$, what is $Sv$?

A better question is: Which row exchanges will produce another Sudoku matrix? Also, which exchanges of block rows give another Sudoku matrix?

Section 4.5 will look at all possible permutations (reorderings) of the rows. I see 6 orders for the first 3 rows, all giving Sudoku matrices. Also 6 permutations of the next 3 rows, and of the last 3 rows. And 6 block permutations of the block rows?


32 Suppose the second row of $A$ is some number $c$ times the first row:

$$A = \begin{bmatrix}a & b\\ ca & cb\end{bmatrix}.$$

Then if $a \ne 0$, the second column of $A$ is what number $d$ times the first column?

A square matrix with dependent rows will also have dependent columns. This is a crucial fact coming soon.
4.2 Solving Linear Equations by Elimination


This section explains a systematic way to solve linear equations, the best way we know. The method is called “elimination”, and you can see it in this 2 by 2 example. Before elimination, $x$ and $y$ appear in both equations. After elimination, the first unknown $x$ has disappeared from the second equation $5y = 5$.

$$\begin{array}{lrcll} \text{Before} & x - 2y &=& 1 & \text{(multiply equation 1 by 2)} \\ & 2x + y &=& 7 & \text{(subtract to eliminate } 2x)\end{array} \qquad \begin{array}{lrcl} \text{After} & x - 2y &=& 1 \\ & 5y &=& 5\end{array}$$

The new equation $5y = 5$ instantly gives $y = 1$. Substituting $y = 1$ back into the first equation leaves $x - 2 = 1$. Therefore $x = 3$ and the solution $(x, y) = (3, 1)$ is complete.


Elimination produces an upper triangular system—this is the goal. The nonzero co- 
efficients 1,—2,5 form a triangle. That system is solved from the bottom upwards, first 
y = 1 and then x = 3. This quick process is called back substitution. It is used for upper 
triangular systems of any size, after elimination produces a triangle. 

Important point: The original equations have the same solution x = 3 and y = 1. Before 
and after elimination, the lines meet at the same point (3, 1). Every step worked with both 
sides of correct equations. 


The step that eliminated x from equation 2 is the fundamental operation in this chapter. 
We use it so often that we look at it closely : 


To eliminate 2x: Subtract a multiple of equation 1 from equation 2. 


Two times x - 2y = 1 gives 2x - 4y = 2. When this is subtracted from 2x + y = 7,
the right side becomes 7 - 2 = 5. The main point is that 2x cancels 2x. The system
becomes triangular.

Ask yourself how that multiplier ℓ = 2 was found. The first equation contains 1x. So the
first pivot was 1 (the coefficient of x). The second equation contains 2x, so the multiplier
was 2. Then subtraction 2x - 2x produced the zero and the triangle.

You will see the multiplier rule if I change the first equation to 3x - 6y = 3.
(Same straight line but the first pivot becomes 3.) The correct multiplier is now ℓ = 2/3.
To find that multiplier, divide the coefficient “2” to be eliminated by the pivot “3”:

$$
\begin{array}{lll}
3x - 6y = 3 & \text{Multiply equation 1 by } \tfrac{2}{3} & 3x - 6y = 3 \\
2x + y = 7 & \text{Subtract from equation 2} & 5y = 5.
\end{array}
$$

The final system is triangular and the last equation still gives y = 1. Back substitution
produces 3x - 6 = 3 and 3x = 9 and x = 3. We changed the numbers but not the lines
or the solution. Divide by the pivot to find that multiplier ℓ = 2/3:


Pivot = first nonzero in the row that does the elimination

Multiplier = (entry to eliminate) divided by (pivot)

The new second equation starts with the second pivot, which is 5. We would use it to
eliminate y from the third equation if there were one. To solve n equations we want n 
pivots. The pivots are on the diagonal of the triangle after elimination. 
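
The pivot and multiplier rule can be tried out in a few lines of code. This is only a sketch for the 2 by 2 case: the function name `solve_2x2` and its argument order are illustrative choices, not notation from the text, and it assumes both pivots turn out nonzero.

```python
from fractions import Fraction

def solve_2x2(a, b, f, c, d, g):
    """Solve ax + by = f, cx + dy = g by elimination.

    Sketch only: assumes the first pivot a and the second pivot are nonzero.
    """
    a, b, f, c, d, g = (Fraction(t) for t in (a, b, f, c, d, g))
    multiplier = c / a             # (entry to eliminate) divided by (pivot)
    pivot2 = d - multiplier * b    # coefficient of y after elimination
    rhs2 = g - multiplier * f      # same subtraction on the right side
    y = rhs2 / pivot2              # back substitution: bottom equation first
    x = (f - b * y) / a
    return multiplier, pivot2, x, y

# x - 2y = 1, 2x + y = 7: multiplier 2, second pivot 5, solution (3, 1)
print(solve_2x2(1, -2, 1, 2, 1, 7))
# 3x - 6y = 3, 2x + y = 7: multiplier 2/3, same second pivot and solution
print(solve_2x2(3, -6, 3, 2, 1, 7))
```

Exact fractions are used so that the multiplier 2/3 is not rounded.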

You could have solved those equations for x and y without reading this book. It is an 
extremely humble problem, but we stay with it a little longer. Even for a 2 by 2 system, 
elimination might break down. By understanding the possible breakdown (when we can’t 
find a full set of pivots), you will understand the whole process of elimination. 


Breakdown of Elimination 


Normally, elimination produces the pivots that take us to the solution. But failure is possible. 
At some point, the method might ask us to divide by zero. We can’t do it. The process has to 
stop. There might be a way to adjust and continue—or failure may be unavoidable. 


Example 1 fails with no solution to 0y = 5. Example 2 fails with too many solutions to
0y = 0. Example 3 succeeds by exchanging the equations.


Example 1 Permanent failure with no solution. Elimination makes this clear : 


$$
\begin{array}{lll}
x - 2y = 1 & \text{Subtract 2 times} & x - 2y = 1 \\
2x - 4y = 7 & \text{eqn. 1 from eqn. 2} & 0y = 5.
\end{array}
$$

There is no solution to 0y = 5. This system has no second pivot. (Zero is never allowed as a
pivot!) If there is no solution, elimination discovers that fact by reaching an
impossible equation like 0y = 5.

The row picture of failure shows parallel lines—which never meet. The column picture 
shows the two columns (1, 2) and (-2, -4) in the same direction. All combinations of the
columns lie along a line. But the column from the right side is in a different direction (1, 7). 
No combination of the columns can produce this right side—therefore no solution. 

When we change the right side from (1,7) to (1,2), failure shows as a whole line of 
solution points. Instead of no solution, Example 2 changes to infinitely many solutions. 


Example 2 Failure with infinitely many solutions. Change b = (1,7) to (1,2). 


$$
\begin{array}{llll}
x - 2y = 1 & \text{Subtract 2 times} & x - 2y = 1 & \textbf{Too few pivots} \\
2x - 4y = 2 & \text{eqn. 1 from eqn. 2} & 0y = 0 & \textbf{Too many solutions}
\end{array}
$$


Every y satisfies 0y = 0. There is really only one equation x - 2y = 1. The unknown y is
“free”. After y is freely chosen, x is determined as x = 1 + 2y. I prefer to see a particular
solution v_p = (1, 0) and a line of null solutions v_n = c(2, 1) in v = v_p + v_n.

$$
\textbf{Complete solution} \qquad
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} + c \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \text{particular } v_p + \text{null } v_n. \tag{1}
$$


In the row picture, the parallel lines have become the same line. Every point (x, y) on that 
line satisfies both equations. 




In the column picture, b = (1, 2) is now the same as column 1. So we can choose x = 1 
and y = 0. We can also choose x = 0 and y = -1/2; column 2 times -1/2 equals b. Every
(x, y) that solves the row problem also solves the column problem. 


Failure For n equations we do not get n pivots. The rows combine into a zero row. 


Success We do get n pivots. But we may have to exchange the n equations. 


Elimination can go wrong in a third way—but this time it can be fixed. Suppose the first 
pivot position contains zero. We refuse to allow zero as a pivot. When the first equation has 
no term involving x, we can exchange it with an equation below: 


Example 3 Temporary failure (zero in pivot). A row exchange produces two pivots:

$$
\begin{array}{lll}
0x + 2y = 4 & \text{Exchange the} & 3x - 2y = 5 \\
3x - 2y = 5 & \text{two equations} & 2y = 4.
\end{array}
$$


The new system is already triangular. This small example is ready for back substitution. The 
last equation gives y = 2, and then the first equation gives x = 3. The row picture is normal 
(two intersecting lines). The column picture is also normal (column vectors not in the same 
direction). The pivots 3 and 2 are normal—but a row exchange was required. 

Examples 1 and 2 are singular—there is no second pivot. Example 3 is nonsingular— 
there is a full set of pivots and exactly one solution. Singular equations have no solution or 
infinitely many solutions. Pivots must be nonzero because we have to divide by them. 
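
The three outcomes in Examples 1 to 3 can be told apart mechanically. The following is a minimal sketch for 2 by 2 systems only; the function name `classify_2x2` and its return format are illustrative assumptions, and the case where column 1 is entirely zero is deliberately left out.

```python
from fractions import Fraction

def classify_2x2(a, b, f, c, d, g):
    """Report how elimination ends for ax + by = f, cx + dy = g."""
    a, b, f, c, d, g = (Fraction(t) for t in (a, b, f, c, d, g))
    exchanged = False
    if a == 0:
        if c == 0:
            raise ValueError("column 1 is all zero: outside this sketch")
        # Temporary failure: exchange the two equations to get a nonzero pivot.
        a, b, f, c, d, g = c, d, g, a, b, f
        exchanged = True
    multiplier = c / a
    pivot2 = d - multiplier * b
    rhs2 = g - multiplier * f
    if pivot2 != 0:
        y = rhs2 / pivot2
        x = (f - b * y) / a
        kind = "one solution after a row exchange" if exchanged else "one solution"
        return kind, (x, y)
    if rhs2 != 0:
        return "no solution", None            # reached 0y = nonzero
    return "infinitely many solutions", None  # reached 0y = 0

print(classify_2x2(1, -2, 1, 2, -4, 7))  # Example 1
print(classify_2x2(1, -2, 1, 2, -4, 2))  # Example 2
print(classify_2x2(0, 2, 4, 3, -2, 5))   # Example 3
```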


Three Equations in Three Unknowns 


To understand Gaussian elimination, you have to go beyond 2 by 2 systems. Three by three 
is enough to see the pattern. For now the matrices are square—an equal number of rows 
and columns. Here is a 3 by 3 system, specially constructed so that all steps lead to whole 
numbers and not fractions : 


$$
\begin{aligned}
2x + 4y - 2z &= 2 \\
4x + 9y - 3z &= 8 \\
-2x - 3y + 7z &= 10
\end{aligned} \tag{2}
$$


What are the steps? The first pivot is the boldface 2 (upper left). Below that pivot we want 
to eliminate the 4. The first multiplier is the ratio 4/2 = 2. Multiply the pivot equation by 
ℓ21 = 2 and subtract. Subtraction removes the 4x from the second equation:


Step 1 Subtract 2 times equation 1 from equation 2. This leaves y + z = 4. 


We also eliminate -2x from equation 3, still using the first pivot. The quick way is to add
equation 1 to equation 3. Then 2x cancels -2x. We do exactly that, but the rule in this book
is to subtract rather than add. The systematic pattern has multiplier ℓ31 = -2/2 = -1.
Subtracting -1 times an equation is the same as adding:


Step 2 Subtract -1 times equation 1 from equation 3. This leaves y + 5z = 12.

The two new equations involve only y and z. The second pivot (in boldface) is 1:

$$
\textbf{x is eliminated} \qquad
\begin{aligned}
\mathbf{1}y + 1z &= 4 \\
1y + 5z &= 12
\end{aligned}
$$

We have reached a 2 by 2 system. The final step eliminates y to make it 1 by 1:

Step 3 Subtract equation 2new from 3new. The multiplier is 1/1 = 1. Then 4z = 8.


The original Av = b has been converted into an upper triangular Uv = c: 


$$
\begin{array}{rcr}
2x + 4y - 2z = 2 & Av = b & 2x + 4y - 2z = 2 \\
4x + 9y - 3z = 8 & \text{has become} & 1y + 1z = 4 \\
-2x - 3y + 7z = 10 & Uv = c & 4z = 8.
\end{array} \tag{3}
$$


The goal is achieved—forward elimination is complete from A to U. The pivots are 
2,1,4 on the diagonal of U. The pivots 1 and 4 were hidden in the original system. 
Elimination brought them out. Uv = c is ready for back substitution, which is quick:


(4z = 8 gives z = 2) (y + z = 4 gives y = 2) (equation 1 gives x = -1)


The solution is (x, y, z) = (-1, 2, 2). The row picture has three planes from the three
equations. All the planes go through this solution. This picture is not easy to draw (it is 
totally impossible for larger systems). 

The column picture shows a combination Av of column vectors producing the right side 
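b. The coefficients in that combination are -1, 2, 2 (the solution):

$$
Av = (-1)\begin{bmatrix} 2 \\ 4 \\ -2 \end{bmatrix} + 2\begin{bmatrix} 4 \\ 9 \\ -3 \end{bmatrix} + 2\begin{bmatrix} -2 \\ -3 \\ 7 \end{bmatrix} \text{ equals } \begin{bmatrix} 2 \\ 8 \\ 10 \end{bmatrix} = b. \tag{4}
$$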


The numbers x, y, z multiply columns 1, 2, 3 in Av = b and also in the triangular Uv = c.
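
The column picture is easy to check numerically. In this small sketch the helper name `combine_columns` is an illustrative choice; it builds Av by adding multiples of the columns of A, using the 3 by 3 matrix and solution from this section.

```python
def combine_columns(A, v):
    """Return Av computed as a combination of the columns of A."""
    rows = len(A)
    result = [0] * rows
    for j, coeff in enumerate(v):      # coefficient j multiplies column j
        for i in range(rows):
            result[i] += coeff * A[i][j]
    return result

A = [[2, 4, -2],
     [4, 9, -3],
     [-2, -3, 7]]
b = combine_columns(A, [-1, 2, 2])
print(b)  # the right side (2, 8, 10)
```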


For a 4 by 4 problem, or an n by n problem, elimination proceeds the same way. 
Here is the whole idea, column by column from A to U, when elimination succeeds. 


Column 1. Use the first equation to create zeros below the first pivot. 
Column 2. Use the new equation 2 to create zeros below the second pivot. 


Columns 3 to n. Keep going to find all n pivots and the triangular U. 
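
The column-by-column plan above can be sketched in code. This is one possible arrangement, not a production routine: the names `eliminate` and `back_substitute` are assumptions made for illustration, exact fractions stand in for floating point, and a row exchange is made only when a zero sits in the pivot position.

```python
from fractions import Fraction

def eliminate(A, b):
    """Forward elimination on Av = b. Returns (U, c, pivots)."""
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    c = [Fraction(x) for x in b]
    for j in range(n):
        # Look for a nonzero pivot in column j, at or below the diagonal.
        p = next((i for i in range(j, n) if U[i][j] != 0), None)
        if p is None:
            raise ValueError(f"no pivot in column {j + 1}: singular system")
        if p != j:                      # row exchange
            U[j], U[p] = U[p], U[j]
            c[j], c[p] = c[p], c[j]
        for i in range(j + 1, n):
            m = U[i][j] / U[j][j]       # multiplier = entry / pivot
            U[i] = [uik - m * ujk for uik, ujk in zip(U[i], U[j])]
            c[i] -= m * c[j]
    return U, c, [U[j][j] for j in range(n)]

def back_substitute(U, c):
    """Solve the upper triangular system Uv = c from the bottom upwards."""
    n = len(U)
    v = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(U[i][k] * v[k] for k in range(i + 1, n))
        v[i] = (c[i] - known) / U[i][i]
    return v

A = [[2, 4, -2], [4, 9, -3], [-2, -3, 7]]
U, c, pivots = eliminate(A, [2, 8, 10])
print(pivots)                 # pivots 2, 1, 4
print(back_substitute(U, c))  # solution (-1, 2, 2)
```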


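$$
\text{After column 2 we have }
\begin{bmatrix} x & x & x & x \\ 0 & x & x & x \\ 0 & 0 & x & x \\ 0 & 0 & x & x \end{bmatrix}.
\qquad \text{We want } U =
\begin{bmatrix} x & x & x & x \\ 0 & x & x & x \\ 0 & 0 & x & x \\ 0 & 0 & 0 & x \end{bmatrix}. \tag{5}
$$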


The result of forward elimination is an upper triangular system. The matrix will be 
nonsingular (= invertible) if and only if there is a full set of n pivots (never zero!).

Here is a final example to show the original Av = b, the triangular system Uv = c, and
the solution v = (x, y, z) from back substitution:

$$
\begin{array}{lclcl}
x + y + z = 6 & & x + y + z = 6 & & \\
x + 2y + 2z = 9 & \text{Forward} & y + z = 3 & \text{Back} & v = \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} \\
x + 2y + 3z = 10 & & z = 1 & &
\end{array}
$$


All multipliers are 1. All pivots are 1. All planes meet at the solution v = (3,2, 1). 
The columns of A combine with coefficients 3, 2, 1 to give b = (6,9, 10): 


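$$
Av = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 2 \\ 1 & 2 & 3 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix}
= 3\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + 2\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} + 1\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
= \begin{bmatrix} 6 \\ 9 \\ 10 \end{bmatrix}.
$$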


The numbers 6,9,10 are dot products. The first number 6 is the dot product of the 
first row (1, 1,1) with v = (3, 2,1). 


Question What coefficient of z in equation 3 would make the system singular ? 
Answer The third pivot would drop from 1 to 0 if the original 3z dropped to 2z. Then the
planes in the row picture have no point in common. 

There is no solution to the new Av = b. The three columns in the column picture
would lie in the same plane, and b = (6,9,10) is not in that plane. So b will not be a 
combination of the columns, if the third column becomes (1,2,2). In this example 
column 3 becomes the same as column 2—useless, we need “independent” columns ! 


Question What coefficient of y in equation 2 would become 0 in the first elimination step ? 
Would the system become singular or not? 


Answer Change equation 2 to x + y + 2z = 7 (for example). The coefficient of y
is now 1. Subtracting equation 1 leaves 0y + z = 1. Now we can exchange equations
2 and 3. This system is nonsingular. No problem except equations in the wrong order.


REVIEW OF THE KEY IDEAS


1. A linear system Av = b becomes upper triangular (Uv = c) by elimination. 
2. We subtract ℓij times equation j from equation i, to make the (i, j) entry zero.

3. The multiplier is ℓij = (entry to eliminate in row i) / (pivot in row j). Pivots can not be zero!


4. A zero in the pivot position can be exchanged if there is a nonzero below it. 
5. Back substitution solves the upper triangular system (bottom to top). 


6. When breakdown is permanent, the system has no solution or infinitely many.

Problem Set 4.2


Problems 1-10 are about elimination on 2 by 2 systems.


1 What multiple ℓ21 of equation 1 should be subtracted from equation 2?

$$
\begin{aligned}
2x + 3y &= 1 \\
10x + 9y &= 11.
\end{aligned}
$$

After this step, solve the triangular system by back substitution, y before x. Verify that
x times (2, 10) plus y times (3, 9) equals (1, 11). If the right side changes to (4, 44),
what is the new solution?


2 If you find solutions v and w to Av = b and Aw = c, what is the solution u to
Au = b + c? What is the solution U to AU = 3b + 4c? (We saw superposition
for linear differential equations, it works in the same way for all linear equations.)


3 What multiple of equation 1 should be subtracted from equation 2?

$$
\begin{aligned}
2x - 4y &= 6 \\
-x + 5y &= 0.
\end{aligned}
$$

After this elimination step, solve the triangular system. If the right side changes to
(-6, 0), what is the new solution?


4 What multiple ℓ of equation 1 should be subtracted from equation 2 to remove cx?

$$
\begin{aligned}
ax + by &= f \\
cx + dy &= g.
\end{aligned}
$$

The first pivot is a (assumed nonzero). Elimination produces what formula for the
second pivot? The second pivot is missing when ad = bc: that is the singular case.


5 Choose a right side which gives no solution and another right side which gives
infinitely many solutions. What are two of those solutions?

$$
\textbf{Singular system} \qquad
\begin{aligned}
3x + 2y &= 10 \\
6x + 4y &= \underline{\quad}
\end{aligned}
$$


6 Choose a coefficient b that makes this system singular. Then choose a right side g that
makes it solvable. Find two solutions in that singular case.

$$
\begin{aligned}
2x + by &= 16 \\
4x + 8y &= g.
\end{aligned}
$$

7 For which a does elimination break down (1) permanently or (2) temporarily?

$$
\begin{aligned}
ax + 3y &= -3 \\
4x + 6y &= 6.
\end{aligned}
$$

Solve for x and y after fixing the temporary breakdown by a row exchange.


8 For which three numbers k does elimination break down? Which is fixed by a row
exchange? In these three cases, is the number of solutions 0 or 1 or ∞?

$$
\begin{aligned}
kx + 3y &= 6 \\
3x + ky &= -6.
\end{aligned}
$$


9 What test on b1 and b2 decides whether these two equations allow a solution? How
many solutions will they have? Draw the column picture for b = (1, 2) and (1, 0).

$$
\begin{aligned}
3x - 2y &= b_1 \\
6x - 4y &= b_2.
\end{aligned}
$$

10 In the xy plane, draw the lines x + y = 5 and x + 2y = 6 and the equation y = ___
that comes from elimination. The line 5x - 4y = c will go through the solution of
these equations if c = ___.


11 (Recommended) A system of linear equations can’t have exactly two solutions. If
(x, y) and (X, Y) are two solutions to Av = b, what is another solution ? 


Problems 12-20 study elimination on 3 by 3 systems (and possible failure).


12 Reduce this system to upper triangular form by two row operations:

$$
\begin{array}{lr}
 & 2x + 3y + z = 8 \\
\text{Eliminate } x & 4x + 7y + 5z = 20 \\
\text{Eliminate } y & -2y + 2z = 0.
\end{array}
$$

Circle the pivots. Solve by back substitution for z, y, x.

13 Apply elimination (circle the pivots) and back substitution to solve

$$
\begin{aligned}
2x - 3y &= 3 \\
4x - 5y + z &= 7 \\
2x - y - 3z &= 5.
\end{aligned}
$$

List the three row operations: Subtract ___ times row ___ from row ___.


14 Which number d forces a row exchange? What is the triangular system (not singular)
for that d? Which d makes this system singular (no third pivot)?

$$
\begin{aligned}
2x + 5y + z &= 0 \\
4x + dy + z &= 2 \\
y - z &= 3.
\end{aligned}
$$

15 Which number b leads later to a row exchange? Which b leads to a singular problem
that row exchanges cannot fix? In that singular case find a nonzero solution x, y, z.

$$
\begin{aligned}
x + by &= 0 \\
x - 2y - z &= 0 \\
y + z &= 0.
\end{aligned}
$$


16 (a) Construct a 3 by 3 system that needs two row exchanges to reach a triangular
form. 

(b) Construct a 3 by 3 system that needs a row exchange for pivot 2, but breaks down 
for pivot 3. 


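17 If rows 1 and 2 are the same, how far can you get with elimination (allowing row
exchange)? If columns 1 and 2 are the same, which pivot is missing?

$$
\begin{array}{llll}
\textbf{Equal} & 2x - y + z = 0 & 2x + 2y + z = 0 & \textbf{Equal} \\
\textbf{rows} & 2x - y + z = 0 & 4x + 4y + z = 0 & \textbf{columns} \\
 & 4x + y + z = 2 & 6x + 6y + z = 2. &
\end{array}
$$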


18 Construct a 3 by 3 example that has 9 different coefficients on the left side, but
rows 2 and 3 become zero in elimination. How many solutions to your system with 
b = (1, 10, 100) and how many with b = (0, 0,0)? 


19 Which number q makes this system singular and which right side t gives it infinitely
many solutions? Find the solution that has z = 1.

$$
\begin{aligned}
x + 4y - 2z &= 1 \\
x + 7y - 6z &= 6 \\
3y + qz &= t.
\end{aligned}
$$

20 Three planes can fail to have an intersection point, even if no planes are parallel.
The system is singular if row 3 is a combination of the first two rows. Find a third
equation that can’t be solved together with x + y + z = 0 and x - 2y - z = 1.


21 Find the pivots and the solution for both systems (Av = b and Sw = b):

$$
\begin{array}{rr}
2x + y = 0 & 2x - y = 0 \\
x + 2y + z = 0 & -x + 2y - z = 0 \\
y + 2z + t = 0 & -y + 2z - t = 0 \\
z + 2t = 5 & -z + 2t = 5.
\end{array}
$$

22 If you extend Problem 21 following the 1, 2, 1 pattern or the -1, 2, -1 pattern,
what is the fifth pivot? What is the nth pivot? S is my favorite matrix.


23 If elimination leads to x + y = 1 and 2y = 3, find three possible original problems.

24 For which two numbers a will elimination fail on $A = \begin{bmatrix} a & 2 \\ a & a \end{bmatrix}$?


25 For which three numbers a will elimination fail to give three pivots?

$$
A = \begin{bmatrix} a & 2 & 3 \\ a & a & 4 \\ a & a & a \end{bmatrix} \text{ is singular for three values of } a.
$$
26 Look for a matrix that has row sums 4 and 8, and column sums 2 and s:

$$
\text{Matrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}
\qquad
\begin{array}{ll}
a + b = 4 & a + c = 2 \\
c + d = 8 & b + d = s
\end{array}
$$

The four equations are solvable only if s = ___. Then find two different matrices
that have the correct row and column sums. Extra credit: Write down the 4 by 4
system Av = (4, 8, 2, s) with v = (a, b, c, d) and make A triangular by elimination.


27 Elimination in the usual order gives what matrix U and what solution (x, y, z) to
this “lower triangular” system? We are really solving by forward substitution:

$$
\begin{aligned}
3x &= 3 \\
6x + 2y &= 8 \\
9x - 2y + z &= 9.
\end{aligned}
$$


28 Create a MATLAB command A(2,:) = ... for the new row 2, to subtract 3 times
row 1 from the existing row 2 if the matrix A is already known.


29 If the last corner entry of A is A(5,5) = 11 and the last pivot of A is
U(5, 5) = 4, what different entry A(5, 5) would have made A singular ? 


Challenge Problems 


30 Suppose elimination takes A to U without row exchanges. Then row i of U is a
combination of which rows of A? If Av = 0, is Uv = 0? If Av = b, is Uv = b?


31 Start with 100 equations Av = 0 for 100 unknowns v = (v1, ..., v100). Suppose
elimination reduces the 100th equation to 0 = 0, so the system is “singular”.


(a) Elimination takes linear combinations of the rows. So this singular system has
the singular property: Some linear combination of the 100 rows is ___.


(b) Singular systems Av = 0 have infinitely many solutions. This means that some
linear combination of the 100 columns is ___.


(c) Invent a 100 by 100 singular matrix with no zero entries. 


(d) For your matrix, describe in words the row picture and the column picture of 
Av = 0. Not necessary to draw 100-dimensional space.

4.3 Matrix Multiplication


We know how to multiply A times a column vector v. Now we want to multiply A times a 
matrix B (matrix-matrix multiplication ). The rule is exactly what we would hope for: 


Multiply A times each column of B to get a column of AB 


The entry in row i, column j of AB is (row i of A) · (column j of B)


If B has only one column (call it v), this is the same matrix-vector multiplication as before. 
When B has n columns, so has AB. The rule for matrix sizes makes dot products possible. 


Rule The number of columns in A must match the number of rows in B. 
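
The dot product rule and the size rule fit in one short function. The sketch below stores a matrix as a list of rows; the name `matmul` and the sample numbers are illustrative assumptions, not taken from the text.

```python
def matmul(A, B):
    """Entry (i, j) of AB is (row i of A) . (column j of B)."""
    n = len(A[0])
    if any(len(row) != n for row in A) or len(B) != n:
        raise ValueError("columns of A must match rows of B")
    p = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[1, 2],
     [3, 4]]            # 2 by 2
B = [[1, 0, 1],
     [0, 1, 1]]         # 2 by 3
AB = matmul(A, B)       # 2 by 3
print(AB)

# Column 3 of AB is A times column 3 of B, here (1, 1).
print([row[2] for row in AB], matmul(A, [[1], [1]]))
```

Trying `matmul(B, A)` raises the error, because B has 3 columns and A has only 2 rows.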


Figure 4.8 shows a typical (row i) · (column j) in the matrix multiplication AB.


A is 4 by 5    B is 5 by 6    AB is 4 by 6


Figure 4.8: Row i of A (entries ai1, ai2, ..., ai5) times column j of B (entries b1j, ..., b5j)
gives (AB)ij. Here i = 2 and j = 3. Then (AB)23 is (row 2 of A) · (column 3 of B).


Let me say right away that normally AB is entirely different from BA. Those have 
different shapes unless A and B are square and the same size. But even the top left corner 
of BA has nothing to do with the top left corner of AB (and then BA ≠ AB).


Top left (row 1 of B) · (column 1 of A) ≠ (row 1 of A) · (column 1 of B).
Example 1 Here A has two columns and B has two rows. We can multiply AB. 


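$$
A_{2 \times 2}\, B_{2 \times 3} = (AB)_{2 \times 3} \qquad
\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}
= \begin{bmatrix} a & b & a+b \\ c & d & c+d \end{bmatrix}
$$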


Column 3 of B is (1,1). Then column 3 of AB is A times (1, 1). 
Example 2 Here B is the 3 by 3 identity matrix (very special, always written B = I). 


$$
\begin{array}{l} B = \textbf{Identity matrix } I \\ AI = A \text{ when sizes are right} \end{array}
\qquad
AI = A \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = A
$$


The first column of that answer is A times the first column (1, 0, 0) of B = I. This just
reproduces the first column of A. Each column of A is unchanged in AI.

Now put the identity matrix first, as in IB. Multiplication gives IB = B for every B
(including B = A). We have here an unusual case, when the order AI gives the same answer
as IA. If A is any square matrix and I has the same size, then AI = IA = A.

Example 3 Another special matrix is the inverse of A. That matrix B is written A^{-1}:

$$
A \text{ times } A^{-1} \text{ is } I \qquad
\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 2 \\ 1 & 2 & 3 \end{bmatrix}
\begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
$$

The dot product of a row of A with a column of A^{-1} is 1 or 0. A^{-1} times A is also I.
To find that matrix A^{-1}, I had to look ahead to Section 4.4; this is a long calculation.
We avoid computing A^{-1} wherever possible, and so does any good linear algebra code.


The key fact about matrix multiplication is that (AB)C = A(BC). (1) 


To multiply three matrices A, B,C you must keep them in order. But you can choose to 
multiply AB first or BC first. Parentheses can be moved, and parentheses can be removed. 


Example 4 Suppose A and C are 3 by 1 matrices (those are column vectors ). Suppose B 
is 1 by 3 (a row vector ). Compute and compare (AB)C and A(BC). 


Solution BC is (1 x 3) times (3 x 1) = 1 x 1. One number d from one dot product : 


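$$
A \text{ times } BC \qquad
\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}
\left( \begin{bmatrix} b_1 & b_2 & b_3 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix} \right)
= \begin{bmatrix} a_1 d \\ a_2 d \\ a_3 d \end{bmatrix}. \tag{2}
$$

On the other hand, AB is (3 x 1) times (1 x 3) = 3 x 3. This AB is a full-size matrix!

$$
AB \text{ times } C \qquad
\left( \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \begin{bmatrix} b_1 & b_2 & b_3 \end{bmatrix} \right)
\begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}
= \begin{bmatrix} a_1 b_1 & a_1 b_2 & a_1 b_3 \\ a_2 b_1 & a_2 b_2 & a_2 b_3 \\ a_3 b_1 & a_3 b_2 & a_3 b_3 \end{bmatrix}
\begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}. \tag{3}
$$

If you multiply that first row of AB times C, you will see a1 d. Multiplying the other rows
by C gives a2 d and a3 d. (AB)C in equation (3) equals A(BC) in equation (2).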
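
Example 4 can be repeated with numbers. In this sketch the vectors (1, 2, 3), (4, 5, 6), (1, 0, 2) are arbitrary choices made for illustration, and `matmul` is a plain row-times-column helper, not a library call.

```python
def matmul(A, B):
    """Rows of A times columns of B (matrices stored as lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1], [2], [3]]      # 3 by 1 column vector
B = [[4, 5, 6]]          # 1 by 3 row vector
C = [[1], [0], [2]]      # 3 by 1 column vector

BC = matmul(B, C)        # 1 by 1: the single number d
AB = matmul(A, B)        # 3 by 3: a full-size matrix
left = matmul(AB, C)     # (AB)C
right = matmul(A, BC)    # A(BC)
print(BC, left, right)
```

Both orders need the same three numbers a1 d, a2 d, a3 d, but A(BC) reaches them with far fewer multiplications than (AB)C.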


The Laws for Matrix Operations 


May I put on record six laws that matrices do obey, while emphasizing an equation they 
don’t obey ? The matrices can be square or rectangular, and the laws involving A + B are all 
simple and all obeyed. Here are three addition laws: 


A + B = B + A    (commutative law)
c(A + B) = cA + cB    (distributive law)
A + (B + C) = (A + B) + C    (associative law).


Three more laws hold for multiplication, but AB = BA is not one of them:

AB ≠ BA    (the commutative “law” is usually broken)
A(B + C) = AB + AC    (distributive law from the left)
(A + B)C = AC + BC    (distributive law from the right)
A(BC) = (AB)C    (associative law for ABC) (parentheses not needed).

When A and B are not square, AB is a different size from BA. These matrices can’t be
equal, even if both multiplications are allowed. For square matrices, almost any example
shows that AB is different from BA:


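$$
AB = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
\quad \text{but} \quad
BA = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}.
$$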


It is true that AI = IA. All square matrices commute with I and also with cI. Only these
matrices cI commute with all other matrices.
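
A quick experiment shows both facts. The matrices below are illustrative choices (P exchanges two rows or two columns), and `matmul` is a small helper written for this sketch.

```python
def matmul(A, B):
    """Rows of A times columns of B (matrices stored as lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2],
     [3, 4]]
P = [[0, 1],
     [1, 0]]
print(matmul(A, P))   # AP exchanges the columns of A
print(matmul(P, A))   # PA exchanges the rows of A, a different matrix

cI = [[5, 0],
      [0, 5]]         # c = 5 times the identity
print(matmul(A, cI) == matmul(cI, A))   # cI commutes with A
```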
The law A(B + C) = AB + AC is proved a column at a time. Start with A(b + c) = 
Ab + Ac for the first column. That is the key to everything—linearity. Say no more. 


Powers of Matrices 


Look at the special case when A = B = C = square matrix. Then (A times A^2) is equal
to (A^2 times A). The product in either order is A^3. The matrix powers A^p follow the same
rules as numbers:


A^p = AAA···A (p factors)    (A^p)(A^q) = A^(p+q)    (A^p)^q = A^(pq).


Those are the ordinary laws for exponents. A^3 times A^4 is A^7 (seven factors). A^3 to
the fourth power is A^12 (twelve A’s). When p and q are zero or negative these rules still
hold, provided A has a “-1 power”, which is the inverse matrix A^{-1}. Then A^0 = I is the
identity matrix (no factors).
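
The exponent laws can be checked for nonnegative powers by repeated multiplication. This is a sketch with an illustrative matrix; the helper names `matmul`, `identity`, and `power` are assumptions made for this example.

```python
def matmul(A, B):
    """Rows of A times columns of B (matrices stored as lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def power(A, p):
    """A^p for p = 0, 1, 2, ... (p factors of A; no factors gives I)."""
    result = identity(len(A))
    for _ in range(p):
        result = matmul(result, A)
    return result

A = [[1, 1],
     [0, 1]]
print(power(A, 0))                                        # the identity
print(matmul(power(A, 3), power(A, 4)) == power(A, 7))    # A^3 A^4 = A^7
print(power(power(A, 3), 4) == power(A, 12))              # (A^3)^4 = A^12
```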

For a number, a^{-1} is 1/a. For a matrix, the inverse is written A^{-1}. (It is never I/A.
But backslash A\I is allowed in MATLAB.) Every number has an inverse except a = 0.
To decide when A has an inverse is a central problem in linear algebra. This section is 
like a Bill of Rights for matrices, to say when A and B can be multiplied and how. 


Elimination Matrices 


We now combine two ideas—elimination and matrices. The goal is to express all the steps 
of elimination in the clearest possible way. You will see how to subtract a multiple ℓij times
row j from row i, using a matrix E.

The column vector b is multiplied by the elimination matrix E:

$$
\text{Subtract } 2b_1 \text{ from } b_2 \qquad
Eb = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
= \begin{bmatrix} b_1 \\ b_2 - 2b_1 \\ b_3 \end{bmatrix}. \tag{4}
$$

Whatever we do to one side of Av = b, we do to the other side. Elimination is multiplying
both sides by E. On the left side, we see row operations.

$$
EA = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \text{row 1} \\ \text{row 2} \\ \text{row 3} \end{bmatrix}
= \begin{bmatrix} \text{row 1} \\ \text{row 2} - 2\,\text{row 1} \\ \text{row 3} \end{bmatrix}. \tag{5}
$$

EA will be our matrix after the first elimination step. The multiplier 2 was chosen to
produce 0 in the 2, 1 position (row 2, column 1). This matrix E should be named E21
because it eliminates the original entry a21 to leave zero.

The next step of elimination comes from a matrix E31 (producing zero in place of a31).
Then E32 produces zero in row 3, column 2, using a multiplier ℓ32. Altogether, the three
steps from A to the upper triangular U come from three elimination matrices:


Elimination by matrices    A becomes E32 E31 E21 A = U (upper triangular).


We do the same operations on the right side. E32 E31 E21 b becomes the new right side vector
c. Then back substitution solves Uv = c.
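
Here is a sketch of the three elimination matrices at work. It reuses the 3 by 3 system from Section 4.2 (multipliers 2, -1, 1) as an assumed test case; the helper names `matmul` and `elimination_matrix` are illustrative, not from the text.

```python
def matmul(A, B):
    """Rows of A times columns of B (matrices stored as lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def elimination_matrix(n, i, j, multiplier):
    """Identity matrix with -multiplier in row i, column j (counting from 1).

    Multiplying by it subtracts multiplier times row j from row i.
    """
    E = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    E[i - 1][j - 1] = -multiplier
    return E

A = [[2, 4, -2], [4, 9, -3], [-2, -3, 7]]
b = [[2], [8], [10]]

E21 = elimination_matrix(3, 2, 1, 2)
E31 = elimination_matrix(3, 3, 1, -1)
E32 = elimination_matrix(3, 3, 2, 1)

E = matmul(E32, matmul(E31, E21))   # all three steps in one matrix
U = matmul(E, A)                    # upper triangular, pivots 2, 1, 4
c = matmul(E, b)                    # new right side (2, 4, 8)
print(U)
print(c)
```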


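Example 5 Choose the multiplier ℓ21 = c/a to produce zero in U21, using E = E21:

$$
EA = \begin{bmatrix} 1 & 0 \\ -c/a & 1 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix}
= \begin{bmatrix} a & b \\ 0 & d - (c/a)b \end{bmatrix} = U \tag{6}
$$

Undo this elimination by adding c/a times row 1 of U to row 2 of U:

$$
E^{-1}U = \begin{bmatrix} 1 & 0 \\ c/a & 1 \end{bmatrix} \begin{bmatrix} a & b \\ 0 & d - (c/a)b \end{bmatrix}
= \begin{bmatrix} a & b \\ c & d \end{bmatrix} = A
$$

Thus U = EA and A = E^{-1}U. Often we write this as A = LU.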


Four Ways to Multiply AB 


I will end this section by writing down four different ways to compute AB. All four ways 
give the same answer. In the end we are doing the same calculations, but we are seeing those 
steps in different orders. 


1. (Rows of A) times (columns of B) (dot products) 

2. A times (columns of B) (matrix-vector multiplications) 

3. (Rows of A) times B (vector-matrix multiplications ) 

4. (Columns of A) times (rows of B) (add up n column-times-row matrices)

Let me look at the 1, 1 entry in the top corner of AB. The usual way is a dot product:

$$
(\text{row 1 of } A) \cdot (\text{column 1 of } B) = (AB)_{11} = a_{11}b_{11} + a_{12}b_{21} + \cdots + a_{1n}b_{n1} \tag{7}
$$

Orders 2 and 3 give that same dot product in AB. Here is order 4, columns times rows:

$$
(\text{column 1 of } A)(\text{row 1 of } B) =
\begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \end{bmatrix}
\begin{bmatrix} b_{11} & b_{12} & \cdots \end{bmatrix}
= \begin{bmatrix} a_{11}b_{11} & \cdot & \cdot \\ \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot \end{bmatrix} \tag{8}
$$

The next column-times-row matrix is (column 2 of A)(row 2 of B). That starts with
a12 b21 in the top left corner. We get a1j bj1 when column j of A multiplies row j of B.
Adding these simple matrices will produce the correct dot product (the sum of a1j bj1) in the
top left corner, and in every entry of AB.
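
The fourth way can be compared with the first on a small example. The matrices here are illustrative choices, and the helper names `matmul` and `columns_times_rows` are assumptions made for this sketch.

```python
def matmul(A, B):
    """Way 1: rows of A times columns of B (dot products)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def columns_times_rows(A, B):
    """Way 4: add up (column j of A)(row j of B) for j = 1, ..., n."""
    m, n, p = len(A), len(B), len(B[0])
    total = [[0] * p for _ in range(m)]
    for j in range(n):
        for i in range(m):
            for k in range(p):
                total[i][k] += A[i][j] * B[j][k]   # entry of one column-times-row matrix
    return total

A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]
print(columns_times_rows(A, B))   # same matrix as matmul(A, B)
```

The first column-times-row matrix is [[5, 6], [15, 18]] and the second is [[14, 16], [28, 32]]; their sum is AB.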


When A and B are n by n matrices, so is AB. It contains n^2 dot products. So it needs
n^3 separate multiplications. For matrices of order n = 100 this is a million multiplications.
No problem, that may only take one second (on the computer). 

When A is an m by n matrix and B is n by p, the product AB is m by p. It contains 
mp dot products. So it needs mnp separate multiplications. 

Matrices of order n = 10,000 need a trillion (10^12) multiplications. Codes avoid
multiplying full matrices whenever possible. And they watch especially for sparse matrices,
when many of the entries (almost all) are zero. The codes don’t waste time multiplying by 
zero. 
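
The count mnp can be confirmed by counting inside the loops. This is a small sketch with an illustrative helper `multiply_and_count`; it counts every multiplication, including those by zero that a sparse code would skip.

```python
def multiply_and_count(A, B):
    """Return AB and the number of separate multiplications used."""
    m, n, p = len(A), len(B), len(B[0])
    count = 0
    C = [[0] * p for _ in range(m)]
    for i in range(m):
        for j in range(p):
            for k in range(n):          # one dot product of length n
                C[i][j] += A[i][k] * B[k][j]
                count += 1
    return C, count

A = [[1] * 3 for _ in range(2)]         # 2 by 3, all ones
B = [[1] * 4 for _ in range(3)]         # 3 by 4, all ones
C, count = multiply_and_count(A, B)
print(count)                            # m n p = 2 * 3 * 4

print(100 ** 3, 10_000 ** 3)            # a million and a trillion
```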


Problem Set 4.3 


Problems 1-16 are about the laws of matrix multiplication.


1 A is 3 by 5, B is 5 by 3, C is 5 by 1, and D is 3 by 1. All entries are 1. Which of these
matrix operations are allowed, and what are the results ? 


BA AB ABD DBA A(B+C). 
2 What rows or columns or matrices do you multiply to find 


(a) the third column of AB? 
(b) the first row of AB ? 
(c) the entry in row 3, column 4 of AB? 


(d) the entry in row 1, column 1 of CDE? 


3 Add AB to AC and compare with A(B + C): 


1935 0:2 3 
A=[5 | and S|, i and ok ae 


4 In Problem 3, multiply A times BC. Then multiply AB times C.


5 Compute A^2 and A^3. Make a prediction for A^5 and A^n:


ie) De 22. 
ee | and els a 


6 Show that (A + B)^2 is different from A^2 + 2AB + B^2, when


i ae 1 “0 
reper al 


Write down the correct rule for (A + B)(A + B) = A^2 + ___ + B^2.

7 True or false. Give a specific example when false:


(a) If columns 1 and 3 of B are the same, so are columns 1 and 3 of AB.
(b) If rows 1 and 3 of B are the same, so are rows 1 and 3 of AB.
(c) If rows 1 and 3 of A are the same, so are rows 1 and 3 of ABC.
(d) (AB)^2 = A^2 B^2.


How is each row of DA and FA related to the rows of A, when 


35-0 =) HOS al a fiengte 5 
Dl, a and B=) l and fee ale 


How is each column of AD and AF related to the columns of A? 


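9 Row 1 of A is added to row 2. This gives EA below. Then column 1 of EA is added
to column 2 to produce (EA)F. Notice E and F in boldface.

$$
EA = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix}
= \begin{bmatrix} a & b \\ a + c & b + d \end{bmatrix}
$$

$$
(EA)F = (EA) \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} a & a + b \\ a + c & a + c + b + d \end{bmatrix}.
$$

Do those steps in the opposite order, first multiply AF and then E(AF). Compare
with (EA)F. What law is obeyed by matrix multiplication?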


10 Row 1 of A is added to row 2 to produce EA. Then F adds row 2 of EA to row 1.
Now F is on the left, for row operations. The result is F(EA):

$$
F(EA) = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a & b \\ a + c & b + d \end{bmatrix}
= \begin{bmatrix} 2a + c & 2b + d \\ a + c & b + d \end{bmatrix}.
$$

Do those steps in the opposite order: first add row 2 to row 1 by FA, then add row 1
of FA to row 2. What law is or is not obeyed by matrix multiplication?
11 (3 by 3 matrices) Choose the only B so that for every matrix A

(a) BA = 4A

(b) BA = 4B (tricky)

(c) BA has rows 1 and 3 of A reversed and row 2 unchanged

(d) All rows of BA are the same as row 1 of A.


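12 Suppose AB = BA and AC = CA for these two particular matrices B and C:

$$
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \text{ commutes with }
B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \text{ and } C = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.
$$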

Prove that a = d and b = c = 0. Then A is a multiple of I. The only matrices that
commute with B and C and all other 2 by 2 matrices are A = multiple of I.

13 Which of the following matrices are guaranteed to equal (A - B)^2: A^2 - B^2,
(B - A)^2, A^2 - 2AB + B^2, A(A - B) - B(A - B), A^2 - AB - BA + B^2?
14 True or false:


(a) If A^2 is defined then A is necessarily square.

(b) If AB and BA are defined then A and B are square.

(c) If AB and BA are defined then AB and BA are square.
(d) If AB = B then A = I.


15 If A is m by n, how many separate multiplications are involved when


(a) A multiplies a vector x with n components?
(b) A multiplies an n by p matrix B?


(c) A multiplies itself to produce A^2? Here m = n and A is square.

16 For A= [3 ~3] and B = [18 4], compute these answers and nothing more : 
(a) column 2 of AB (b) row 2 of AB (c) row 2 of A^2
(d) row 2 of A^3.


Problems 17-19 use aij for the entry in row i, column j of A.


17 Write down the 3 by 3 matrix A whose entries are
(a) aij = minimum of i and j (b) aij = (-1)^(i+j) (c) aij = i/j.


18 What words would you use to describe each of these classes of matrices? Give a 
3 by 3 example in each class. Which matrix belongs to all four classes ? 


(a) aij = 0 if i ≠ j (b) aij = 0 if i < j (c) aij = aji
(d) aij = a1j.

19 The entries of A are aij. Assuming that zeros don’t appear, what is
(a) the first pivot?
(b) the multiplier ℓ31 of row 1 to be subtracted from row 3?
(c) the new entry that replaces a32 after that subtraction?


(d) the second pivot? 
Problems 20-24 involve powers of A. 


20 Compute A^2, A^3, A^4 and also Av, A^2 v, A^3 v, A^4 v for

$$
A = \begin{bmatrix} 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\text{ and } v = \begin{bmatrix} x \\ y \\ z \\ t \end{bmatrix}.
$$

21 Find all the powers A^2, A^3, ... and AB, (AB)^2, ... for
pe iB 1 0O 
A=| 3 | ‘is B=|4 Lab 
22 By trial and error find real nonzero 2 by 2 matrices such that


A^2 = -I    BC = 0    DE = -ED (not allowing DE = 0).


23 (a) Find a nonzero matrix A for which A^2 = 0.


(b) Find a matrix that has A^2 ≠ 0 but A^3 = 0.


24 By experiment with n = 2 and n = 3 predict A^n for these matrices:


2, 1 1 a b 
A=|5 | and A= |} | and As=| 6 ar 


Problems 25-31 use column-row multiplication and block multiplication. 


25 Multiply A times J using columns of A (3 by 3) times rows of J. 


26 Multiply AB using columns times rows: 


3 


1 0 3 
AB = 24] 
2 1 LA 


27 Show that the product of two upper triangular matrices is always upper triangular:

$$AB = \begin{bmatrix} x & x & x \\ 0 & x & x \\ 0 & 0 & x \end{bmatrix} \begin{bmatrix} x & x & x \\ 0 & x & x \\ 0 & 0 & x \end{bmatrix} = \begin{bmatrix} x & x & x \\ 0 & x & x \\ 0 & 0 & x \end{bmatrix}.$$

Proof using dot products (Row-times-column) (Row 2 of A) · (column 1 of B) = 0.
Which other dot products give zeros?

Proof using full matrices (Column-times-row) Draw x’s and 0’s in (column 2 of A)
times (row 2 of B). Also show (column 3 of A) times (row 3 of B).


28 If A is 2 by 3 with rows 1, 1, 1 and 2, 2, 2, and B is 3 by 4 with columns 1, 1, 1 and 2,
2, 2 and 3, 3, 3 and 4, 4, 4, use each of the four multiplication rules to find AB:

(1) Rows of A times columns of B. Inner products (each entry in AB)
(2) Matrix A times columns of B. Columns of AB
(3) Rows of A times the matrix B. Rows of AB
(4) Columns of A times rows of B. Outer products (3 matrices add to AB)
29 Which matrices $E_{21}$ and $E_{31}$ produce zeros in the (2, 1) and (3, 1) positions of $E_{21}A$
and $E_{31}A$?

Find the single matrix $E = E_{31}E_{21}$ that produces both zeros at once. Multiply $EA$.

30 Block multiplication produces zeros below the pivot in one big step:

$$EA = \begin{bmatrix} 1 & 0 \\ -c/a & I \end{bmatrix} \begin{bmatrix} a & b \\ c & D \end{bmatrix} = \begin{bmatrix} a & b \\ 0 & D - cb/a \end{bmatrix} \quad\text{with vectors } 0, b, c.$$

In Problem 29, what are c and D and what is the block $D - cb/a$?

31 With $i^2 = -1$, the product of $(A + iB)$ and $(x + iy)$ is $Ax + iBx + iAy - By$. Use
blocks to separate the real part without i from the imaginary part that multiplies i:

$$\begin{bmatrix} A & -B \\ ? & ? \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} Ax - By \\ ? \end{bmatrix} \quad \begin{matrix} \text{real part} \\ \text{imaginary part} \end{matrix}$$

32 (Very important) Suppose you solve $Av = b$ for three special right sides b:

$$Av_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \quad\text{and}\quad Av_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \quad\text{and}\quad Av_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.$$

If the three solutions $v_1, v_2, v_3$ are the columns of a matrix X, what is A times X?

33 If the three solutions in Question 32 are $v_1 = (1, 1, 1)$ and $v_2 = (0, 1, 1)$ and
$v_3 = (0, 0, 1)$, solve $Av = b$ when $b = (3, 5, 8)$. Challenge problem: What is A?

34 Practical question Suppose A is m by n, B is n by p, and C is p by q. Then
the multiplication count for (AB)C is mnp + mpq. The same answer comes from
A times BC, now with mnq + npq separate multiplications. Notice npq for BC.

(a) If A is 2 by 4, B is 4 by 7, and C is 7 by 10, do you prefer (AB)C or A(BC)?
(b) With N-component vectors, would you choose $(u^Tv)w^T$ or $u^T(vw^T)$?
(c) Divide by mnpq to show that (AB)C is faster when $n^{-1} + q^{-1} < m^{-1} + p^{-1}$.

35 Unexpected fact A friend in England looked at powers of a 2 × 2 matrix:

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \quad A^2 = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix} \quad A^3 = \begin{bmatrix} 37 & 54 \\ 81 & 118 \end{bmatrix} \quad A^n = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$$

He noticed that the ratios 2/3 and 10/15 and 54/81 are all the same. This is true for
all powers. It doesn’t work for an n × n matrix, unless A is tridiagonal. One neat proof
is to look at the equal (1, 1) entries of $A^nA$ and $AA^n$. Can you use that idea to show
that B/C = 2/3 in this example?
4.4 Inverse Matrices


Suppose A is a square matrix. We look for an “inverse matrix” $A^{-1}$ of the same size, so
that $A^{-1}$ times A equals I. Whatever A does, $A^{-1}$ undoes. Their product is the identity
matrix, which leaves all vectors unchanged, so $A^{-1}Av = v$. But $A^{-1}$ might not exist.

What a matrix mostly does is to multiply a vector v. Multiplying $Av = b$ by $A^{-1}$
gives $A^{-1}Av = A^{-1}b$. This is $v = A^{-1}b$. The product $A^{-1}A$ is like multiplying by a
number and then dividing by that number. A number has an inverse if it is not zero. Matrices
are more complicated and more interesting. The matrix $A^{-1}$ is called “A inverse.”

DEFINITION The matrix A is invertible if there exists a matrix $A^{-1}$ such that

$$A^{-1}A = I \quad\text{and}\quad AA^{-1} = I. \qquad (1)$$

Not all matrices have inverses. This is the first question we ask about a square matrix:
Is A invertible? We don’t mean that we immediately calculate $A^{-1}$. In most problems
we never compute it! Here are six “notes” about $A^{-1}$.


Note 1 $A^{-1}$ exists if and only if elimination produces n pivots (row exchanges
are allowed). Elimination solves $Av = b$ without explicitly using the matrix $A^{-1}$.


Note 2 The matrix A cannot have two different inverses. Suppose $BA = I$ and also
$AC = I$. Then $B = C$, according to this “proof by parentheses”:

$$B(AC) = (BA)C \quad\text{gives}\quad BI = IC \quad\text{or}\quad B = C. \qquad (2)$$

This shows that a left-inverse B (multiplying A from the left to give $BA = I$) and a
right-inverse C (multiplying A from the right to give $AC = I$) must be the same matrix.


Note 3 If A is invertible, the one and only solution to $Av = b$ is $v = A^{-1}b$:

Multiply $Av = b$ by $A^{-1}$. Then $v = A^{-1}Av = A^{-1}b$.


Note 4 (Important) Suppose there is a nonzero vector v such that $Av = 0$. Then
A cannot have an inverse. No matrix can bring 0 back to v.


If A is invertible, then $Av = 0$ can only have the zero solution $v = A^{-1}0 = 0$.


Note 5 A 2 by 2 matrix is invertible if and only if $ad - bc$ is not zero:

2 by 2 Inverse (divide by $ad - bc$)

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}. \qquad (3)$$

This number $ad - bc$ is the determinant of A. A matrix is invertible if its determinant is
not zero. $A^{-1}$ always involves a division by the determinant of A.
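The 2 by 2 formula in Note 5 can be tried out in a few lines of Python. This is only a sketch: the function name `inverse_2x2` and the two test matrices are choices made here for illustration, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def inverse_2x2(a, b, c, d):
    """Return the inverse of [[a, b], [c, d]], or None when ad - bc = 0."""
    det = Fraction(a) * d - Fraction(b) * c
    if det == 0:
        return None  # zero determinant: no inverse exists
    # Swap a and d, change the signs of b and c, divide by the determinant.
    return [[d / det, -b / det], [-c / det, a / det]]

invertible_case = inverse_2x2(2, 3, 4, 7)   # determinant 2*7 - 3*4 = 2
singular_case = inverse_2x2(1, 2, 1, 2)     # determinant 1*2 - 2*1 = 0
print(invertible_case)
print(singular_case)
```

The first call divides by the determinant 2. The second call returns `None` because the determinant is zero, which is the test in Note 5.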
Note 6 A diagonal matrix has an inverse provided no diagonal entries are zero:

$$\text{If}\quad A = \begin{bmatrix} d_1 & & \\ & \ddots & \\ & & d_n \end{bmatrix} \quad\text{then}\quad A^{-1} = \begin{bmatrix} 1/d_1 & & \\ & \ddots & \\ & & 1/d_n \end{bmatrix}.$$

Example 1 The 2 by 2 matrix $A = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix}$ is not invertible. It fails the test in Note 5,
because $ad - bc$ equals $2 - 2 = 0$. It fails the test in Note 4, because $Av = 0$ when
$v = (2, -1)$. It fails to have two pivots as required by Note 1.

Elimination turns the second row of this matrix A into a zero row.


The Inverse of a Product AB 


For two nonzero numbers a and b, the sum a + b might or might not be invertible. The
numbers $a = 3$ and $b = -3$ have inverses $\tfrac13$ and $-\tfrac13$. Their sum $a + b = 0$ has no inverse.
But the product $ab = -9$ does have an inverse, which is $\tfrac13$ times $-\tfrac13$.

For two matrices A and B, the situation is similar. It is hard to say much about the
invertibility of A + B. But the product AB has an inverse, if and only if the two factors
A and B are separately invertible (and the same size). The important point is that $A^{-1}$ and
$B^{-1}$ come in reverse order:

If A and B are invertible then so is AB. The inverse of a product AB is

$$(AB)^{-1} = B^{-1}A^{-1}. \qquad (4)$$

To see why the order is reversed, multiply AB times $B^{-1}A^{-1}$. Inside that is $BB^{-1} = I$:

Inverse of AB     $(AB)(B^{-1}A^{-1}) = AIA^{-1} = AA^{-1} = I.$

We moved parentheses to multiply $BB^{-1}$ first. Similarly $B^{-1}A^{-1}$ times AB equals I.
This illustrates a basic rule of mathematics: Inverses come in reverse order. It is also
common sense: If you put on socks and then shoes, the first to be taken off are the ____.
The same reverse order applies to three or more matrices:

Reverse order     $(ABC)^{-1} = C^{-1}B^{-1}A^{-1}.$     (5)
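A small numerical check of the reverse-order rule (4) may help. The sketch below assumes two invertible 2 by 2 matrices chosen only for illustration, and the helper names `matmul` and `inv2` are introduced here, not taken from the text. It compares the inverse of AB with the product of the inverses in both orders.

```python
from fractions import Fraction

def matmul(X, Y):
    """Rows of X times columns of Y."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inv2(M):
    """Inverse of an invertible 2 by 2 matrix by the ad - bc formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 3], [4, 7]]   # illustrative choice, determinant 2
B = [[1, 1], [0, 2]]   # illustrative choice, determinant 2

AB_inv = inv2(matmul(A, B))
reverse_order = matmul(inv2(B), inv2(A))   # B^-1 A^-1
same_order = matmul(inv2(A), inv2(B))      # A^-1 B^-1

print(AB_inv == reverse_order)   # the reverse order matches
print(AB_inv == same_order)      # the same order does not
```

For these two matrices only the reverse order reproduces the inverse of AB, because A and B do not commute.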


Example 2 Inverse of an elimination matrix. If E subtracts 5 times row 1 from row 2,
then $E^{-1}$ adds 5 times row 1 to row 2:

$$E = \begin{bmatrix} 1 & 0 & 0 \\ -5 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad\text{and}\quad E^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 5 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

Multiply $EE^{-1}$ to get the identity matrix I. Also multiply $E^{-1}E$ to get I. We are adding
and subtracting the same 5 times row 1. Whether we add and then subtract (this is $EE^{-1}$)
or subtract and then add (this is $E^{-1}E$), we are back at the start.
For square matrices, an inverse on one side is automatically an inverse on the other side.


If $AB = I$ then automatically $BA = I$ for square matrices. In that case B is $A^{-1}$. This is
extremely useful to know but we are not ready to prove it.


Example 3 Suppose F subtracts 4 times row 2 from row 3, and $F^{-1}$ adds it back:

$$F = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -4 & 1 \end{bmatrix} \quad\text{and}\quad F^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 4 & 1 \end{bmatrix}.$$

Now multiply F by the matrix E in Example 2 to find FE. Also multiply $E^{-1}$ times $F^{-1}$
to find $(FE)^{-1}$. Notice the required order $(FE)^{-1} = E^{-1}F^{-1}$ for the inverses.

Right order, good inverse

$$FE = \begin{bmatrix} 1 & 0 & 0 \\ -5 & 1 & 0 \\ 20 & -4 & 1 \end{bmatrix} \quad\text{and}\quad E^{-1}F^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 5 & 1 & 0 \\ 0 & 4 & 1 \end{bmatrix}. \qquad (6)$$

The result is beautiful and correct. The product FE contains “20” but its inverse doesn’t.
E subtracts 5 times row 1 from row 2. Then F subtracts 4 times the new row 2 (changed
by row 1) from row 3. In this order FE, row 3 feels an effect from row 1.

In the order $E^{-1}F^{-1}$, that effect does not happen. First $F^{-1}$ adds 4 times row 2 to row 3.
After that, $E^{-1}$ adds 5 times row 1 to row 2. There is no 20, because row 3 doesn’t change
again. In this order $E^{-1}F^{-1}$, row 3 feels no effect from row 1.


$E^{-1}F^{-1}$ is quick. The multipliers 5, 4 fall into place below the diagonal of 1’s.
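The two products in Example 3 are easy to confirm by direct multiplication. The sketch below enters E, F and their inverses as described in Examples 2 and 3. The helper `matmul` is a name introduced here for illustration.

```python
def matmul(X, Y):
    """Rows of X times columns of Y."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

E     = [[1, 0, 0], [-5, 1, 0], [0, 0, 1]]   # subtract 5 * row 1 from row 2
E_inv = [[1, 0, 0], [5, 1, 0], [0, 0, 1]]    # add it back
F     = [[1, 0, 0], [0, 1, 0], [0, -4, 1]]   # subtract 4 * row 2 from row 3
F_inv = [[1, 0, 0], [0, 1, 0], [0, 4, 1]]    # add it back

FE = matmul(F, E)                    # elimination order: E acts first, then F
FE_inv = matmul(E_inv, F_inv)        # reverse order for the inverse
check = matmul(FE, FE_inv)           # should be the identity

print(FE)
print(FE_inv)
print(check)
```

The entry 20 shows up in FE but not in the inverse, where the multipliers 5 and 4 sit unchanged below the diagonal.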


Calculating $A^{-1}$ by Gauss-Jordan Elimination


I hinted that $A^{-1}$ might not be explicitly needed. The equation $Av = b$ is solved by
$v = A^{-1}b$. But it is not necessary or efficient to compute $A^{-1}$ and multiply it times b.
Elimination goes directly to v. Elimination is also the way to find $A^{-1}$, as we now show.


The Gauss-Jordan idea is to solve $AA^{-1} = I$. Find each column of $A^{-1}$.


A multiplies the first column of $A^{-1}$ (call that $v_1$) to give the first column of I (call
that $e_1$). This is our equation $Av_1 = e_1 = (1, 0, 0)$. There will be two more equations.
Each of the columns $v_1, v_2, v_3$ of $A^{-1}$ is multiplied by A to produce a column of I:

3 columns of $A^{-1}$     $AA^{-1} = A\,[\,v_1\ \ v_2\ \ v_3\,] = [\,e_1\ \ e_2\ \ e_3\,] = I.$     (7)

To invert a 3 by 3 matrix A, we have to solve three systems of equations: $Av_1 = e_1$ and
$Av_2 = e_2 = (0, 1, 0)$ and $Av_3 = e_3 = (0, 0, 1)$. Gauss-Jordan finds $A^{-1}$ this way.
The Gauss-Jordan method computes $A^{-1}$ by solving all n equations together.
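Usually the “augmented matrix” $[A\ \ b]$ has one extra column b. Now we have three right
sides (the columns of I). So the augmented matrix is the block matrix $[A\ \ I]$.

Start Gauss-Jordan on $[A\ \ I]$:

$$[A\ \ e_1\ \ e_2\ \ e_3] = \left[\begin{array}{rrr|rrr} 2 & -1 & 0 & 1 & 0 & 0 \\ -1 & 2 & -1 & 0 & 1 & 0 \\ 0 & -1 & 2 & 0 & 0 & 1 \end{array}\right]$$

$$\rightarrow \left[\begin{array}{rrr|rrr} 2 & -1 & 0 & 1 & 0 & 0 \\ 0 & \tfrac32 & -1 & \tfrac12 & 1 & 0 \\ 0 & -1 & 2 & 0 & 0 & 1 \end{array}\right] \quad (\tfrac12\ \text{row 1} + \text{row 2})$$

$$\rightarrow \left[\begin{array}{rrr|rrr} 2 & -1 & 0 & 1 & 0 & 0 \\ 0 & \tfrac32 & -1 & \tfrac12 & 1 & 0 \\ 0 & 0 & \tfrac43 & \tfrac13 & \tfrac23 & 1 \end{array}\right] \quad (\tfrac23\ \text{row 2} + \text{row 3})$$

We are halfway to $A^{-1}$. The matrix in the first three columns is U (upper triangular).
The pivots $2, \tfrac32, \tfrac43$ are on its diagonal. Gauss would finish by back substitution. Jordan’s
idea is to continue with elimination! He goes all the way to the identity matrix.

Rows are subtracted from rows above, to produce zeros above the pivots:

$$(\text{Zero above third pivot}) \quad \rightarrow \left[\begin{array}{rrr|rrr} 2 & -1 & 0 & 1 & 0 & 0 \\ 0 & \tfrac32 & 0 & \tfrac34 & \tfrac32 & \tfrac34 \\ 0 & 0 & \tfrac43 & \tfrac13 & \tfrac23 & 1 \end{array}\right] \quad (\tfrac34\ \text{row 3} + \text{row 2})$$

$$(\text{Zero above second pivot}) \quad \rightarrow \left[\begin{array}{rrr|rrr} 2 & 0 & 0 & \tfrac32 & 1 & \tfrac12 \\ 0 & \tfrac32 & 0 & \tfrac34 & \tfrac32 & \tfrac34 \\ 0 & 0 & \tfrac43 & \tfrac13 & \tfrac23 & 1 \end{array}\right] \quad (\tfrac23\ \text{row 2} + \text{row 1})$$

The last Gauss-Jordan step is to divide each row by its pivot. The new pivots are 1.
We have reached I in the first half of the matrix, because A is invertible.
The three columns of $A^{-1}$ are in the second half of $[I\ \ A^{-1}]$:

$$\begin{matrix} (\text{divide by } 2) \\ (\text{divide by } \tfrac32) \\ (\text{divide by } \tfrac43) \end{matrix} \quad \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & \tfrac34 & \tfrac12 & \tfrac14 \\ 0 & 1 & 0 & \tfrac12 & 1 & \tfrac12 \\ 0 & 0 & 1 & \tfrac14 & \tfrac12 & \tfrac34 \end{array}\right] = [\,I\ \ x_1\ \ x_2\ \ x_3\,] = [\,I\ \ A^{-1}\,].$$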


Starting from the 3 by 6 matrix $[A\ \ I]$, we ended with $[I\ \ A^{-1}]$. Here is the whole
Gauss-Jordan process on one line for any invertible matrix A:


Gauss-Jordan     Multiply $[A\ \ I]$ by $A^{-1}$ to get $[I\ \ A^{-1}]$.
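The whole Gauss-Jordan process can be sketched in Python. The function name `gauss_jordan_inverse` is introduced here for illustration. This version clears the entries below and above each pivot in one sweep per column, which is a slight reordering of the steps in the text but reaches the same $[I\ \ A^{-1}]$. Exact fractions avoid rounding. The test matrix is the 3 by 3 tridiagonal matrix with 2’s on the diagonal and −1’s beside it.

```python
from fractions import Fraction

def gauss_jordan_inverse(A):
    """Row-reduce [A I] to [I A^-1]; return A^-1, or None if A is singular."""
    n = len(A)
    # Augmented matrix [A I] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Look at or below the diagonal for a nonzero entry to use as the pivot.
        pivot_row = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot_row is None:
            return None                                # fewer than n pivots
        M[col], M[pivot_row] = M[pivot_row], M[col]    # row exchange if needed
        # Produce zeros below and above the pivot.
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    # Divide each row by its pivot; the right half is then A^-1.
    return [[x / M[i][i] for x in M[i][n:]] for i in range(n)]

A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
A_inv = gauss_jordan_inverse(A)
for row in A_inv:
    print([str(x) for x in row])
```

A singular input such as the matrix with rows 1, 2 and 1, 2 makes the function return `None`, because no second pivot can be found.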


The elimination steps create the inverse matrix while changing A to I. For large matrices,
we probably don’t want $A^{-1}$ at all. But for small matrices, it can be very worthwhile to know
the inverse. We add three observations about this particular $A^{-1}$ because it is an important
example. We introduce the words symmetric, tridiagonal, and determinant:
1. A is symmetric across its main diagonal. So is $A^{-1}$.


2. A is tridiagonal (only three nonzero diagonals). But $A^{-1}$ is a full matrix with
no zeros. That is another reason we don’t often compute inverse matrices. The
inverse of a sparse matrix is generally a full matrix.


3. The product of pivots is $2(\tfrac32)(\tfrac43) = 4$. This number 4 is the determinant of A.

$A^{-1}$ involves division by the determinant

$$A^{-1} = \frac14 \begin{bmatrix} 3 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 3 \end{bmatrix}. \qquad (8)$$

This is why an invertible matrix cannot have a zero determinant.


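Example 4 Find $A^{-1}$ by Gauss-Jordan elimination starting from $A = \begin{bmatrix} 2 & 3 \\ 4 & 7 \end{bmatrix}$. There are
two row operations and then a division to put 1’s in the pivots:

$$[A\ \ I] = \left[\begin{array}{rr|rr} 2 & 3 & 1 & 0 \\ 4 & 7 & 0 & 1 \end{array}\right] \rightarrow \left[\begin{array}{rr|rr} 2 & 3 & 1 & 0 \\ 0 & 1 & -2 & 1 \end{array}\right] \quad (\text{this is } [U\ \ L^{-1}])$$

$$\rightarrow \left[\begin{array}{rr|rr} 2 & 0 & 7 & -3 \\ 0 & 1 & -2 & 1 \end{array}\right] \rightarrow \left[\begin{array}{rr|rr} 1 & 0 & \tfrac72 & -\tfrac32 \\ 0 & 1 & -2 & 1 \end{array}\right] \quad (\text{this is } [I\ \ A^{-1}]).$$

That $A^{-1}$ involves division by the determinant $ad - bc = 2 \cdot 7 - 3 \cdot 4 = 2$. The matrix
A must be invertible, or elimination cannot reduce it to I (in the left half of $[I\ \ A^{-1}]$).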


Gauss-Jordan shows why $A^{-1}$ is expensive. We must solve n equations for its n columns.
To solve $Av = b$ without $A^{-1}$, we deal with one column b to find one column v.


In defense of $A^{-1}$, we want to say that its cost is not n times the cost of one system.
Surprisingly, the cost for n columns is only multiplied by 3. This saving is because the n
equations $Av_i = e_i$ all involve the same matrix A. Working with the right sides is relatively
cheap, because elimination only has to be done once on A.

The complete $A^{-1}$ needs $n^3$ elimination steps, where a single system $Av = b$ needs $n^3/3$.


Singular versus Invertible 


We come back to the central question. Which matrices have inverses? The start of this
section proposed the pivot test: $A^{-1}$ exists exactly when A has a full set of n pivots.
(Row exchanges are allowed.) Now we can prove that by Gauss-Jordan elimination:


1. With n pivots, elimination solves all the equations $Av_i = e_i$. The columns $v_i$ go into
$A^{-1}$. Then $AA^{-1} = I$ and $A^{-1}$ is at least a right-inverse.


2. Elimination is really a sequence of multiplications by E’s and P’s and $D^{-1}$:

Left-inverse of A     $(D^{-1} \cdots E \cdots P \cdots E)\,A = I.$     (9)

$D^{-1}$ divides by the pivots. The matrices E produce zeros below and above the pivots.
Permutations P will exchange rows if needed. The product matrix in equation (9) is
a left-inverse. With n pivots we have reached $A^{-1}A = I$.

The right-inverse equals the left-inverse. That was Note 2 at the start of this section.
So a square matrix with a full set of pivots will always have a two-sided inverse.

Reasoning in reverse will now show that A must have n pivots if $AC = I$. (Then we
deduce that C is also a left-inverse and $CA = I$.) Here is one route to those conclusions:


1. If A doesn’t have n pivots, elimination will lead to a zero row.

2. Those elimination steps are taken by an invertible M. So a row of MA is zero.

3. If $AC = I$ had been possible, then $MAC = M$. The zero row of MA, times C, gives
a zero row of M itself.

4. An invertible matrix M can’t have a zero row! So A must have n pivots if $AC = I$.

That argument took four steps, but the outcome is short and important.


Elimination gives a complete test for invertibility of a square matrix. $A^{-1}$ exists (and
Gauss-Jordan finds it) exactly when A has n pivots. The argument above shows more:


If $AC = I$ then $CA = I$ and $C = A^{-1}$


Example 5 Here L is lower triangular with 1’s on the diagonal. Then $L^{-1}$ is too.
A triangular matrix is invertible if and only if no diagonal entries are zero.


Here L has 1’s so $L^{-1}$ also has 1’s. Use the Gauss-Jordan method to construct $L^{-1}$.
Start by subtracting multiples of pivot rows from rows below. Normally this gets us
halfway to the inverse, but for L it gets us all the way. $L^{-1}$ appears on the right when
I appears on the left. Notice how $L^{-1}$ contains 11, from 3 times 5 minus 4.

Gauss-Jordan on triangular L

$$[L\ \ I] = \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 3 & 1 & 0 & 0 & 1 & 0 \\ 4 & 5 & 1 & 0 & 0 & 1 \end{array}\right]$$

$$\rightarrow \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & -3 & 1 & 0 \\ 0 & 5 & 1 & -4 & 0 & 1 \end{array}\right] \quad \begin{matrix} (\text{3 times row 1 from row 2}) \\ (\text{4 times row 1 from row 3}) \end{matrix}$$

$$\rightarrow \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & -3 & 1 & 0 \\ 0 & 0 & 1 & 11 & -5 & 1 \end{array}\right] = [\,I\ \ L^{-1}\,] \quad (\text{then 5 times row 2 from row 3})$$

L goes to I by a product of elimination matrices $E_{32}E_{31}E_{21}$. So that product is $L^{-1}$.
The 11 in $L^{-1}$ does not come into L, to spoil 3, 4, 5 in the good order $E_{21}^{-1}E_{31}^{-1}E_{32}^{-1} = L$.
REVIEW OF THE KEY IDEAS


1. The inverse matrix gives $AA^{-1} = I$ and $A^{-1}A = I$.

2. A is invertible if and only if it has n pivots (row exchanges allowed).

3. If $Av = 0$ for a nonzero vector v, then A has no inverse.

4. The inverse of AB is the reverse product $B^{-1}A^{-1}$. And $(ABC)^{-1} = C^{-1}B^{-1}A^{-1}$.

5. The Gauss-Jordan method solves $AA^{-1} = I$ to find the n columns of $A^{-1}$. The
augmented matrix $[A\ \ I]$ is row-reduced to $[I\ \ A^{-1}]$.


Problem Set 4.4 


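1 Find the inverses of A, B, C (directly or from the 2 by 2 formula):

$$A = \begin{bmatrix} 0 & 3 \\ 4 & 0 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 2 & 0 \\ 4 & 2 \end{bmatrix} \quad\text{and}\quad C = \begin{bmatrix} 3 & 4 \\ 5 & 7 \end{bmatrix}.$$

2 For these “permutation matrices” find $P^{-1}$ by trial and error (with 1’s and 0’s):

$$P = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \quad\text{and}\quad P = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}.$$

3 Solve for the first column (x, y) and second column (t, z) of $A^{-1}$: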


allel) [ool ]=[i 


4 Show that | 4 2] is not invertible by trying to solve $AA^{-1} = I$ for column 1 of $A^{-1}$,
with right side (1, 0). (For a different A, could column 1 of $A^{-1}$ be possible to find
but not column 2?)

5 Find an upper triangular U (not diagonal) with $U^2 = I$ which gives $U = U^{-1}$.

6 (a) If A is invertible and $AB = AC$, prove quickly that $B = C$.
(b) If $A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$, find two different matrices such that $AB = AC$.
7 (Important) If A has row 1 + row 2 = row 3, show that A is not invertible:

(a) Explain why $Av = (1, 0, 0)$ cannot have a solution.
(b) Which right sides $(b_1, b_2, b_3)$ might allow a solution to $Av = b$?
(c) What happens to row 3 in elimination?

8 If A has column 1 + column 2 = column 3, show that A is not invertible:

(a) Find a nonzero solution x to $Ax = 0$. The matrix is 3 by 3.
(b) Elimination keeps column 1 + column 2 = column 3. Why is there no third pivot?

9 Suppose A is invertible and you exchange its first two rows to reach B. Is the new
matrix B invertible and how would you find $B^{-1}$ from $A^{-1}$?

10 Find the inverses (in any legal way) of


0.20". 0.2 ode OOO

0 3 0 43 00
A= : and B=

0 4 6 5

00 00 76


11 (a) Find invertible matrices A and B such that A + B is not invertible.
(b) Find singular matrices A and B such that A + B is invertible.

12 If the product $C = AB$ is invertible (A and B are square), then A itself is invertible.
Find a formula for $A^{-1}$ that involves $C^{-1}$ and B.

13 If the product $M = ABC$ of three square matrices is invertible, then B is invertible.
(So are A and C.) Find a formula for $B^{-1}$ that involves $M^{-1}$ and A and C.

14 If you add row 1 of A to row 2 to get B, how do you find $B^{-1}$ from $A^{-1}$?

Notice the order. The inverse of $B = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} A$ is ____.

15 Prove that a matrix with a column of zeros cannot have an inverse.

16 Multiply $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ times $\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$. What is the inverse of each matrix if $ad \neq bc$?

17 (a) What 3 by 3 matrix E has the same effect as these three steps? Subtract row 1
from row 2, subtract row 1 from row 3, then subtract row 2 from row 3.

(b) What single matrix L has the same effect as these three reverse steps? Add row
2 to row 3, add row 1 to row 3, then add row 1 to row 2.

18 If B is the inverse of $A^2$, show that AB is the inverse of A.
19 (Recommended) A is a 4 by 4 matrix with 1’s on the diagonal and $-a, -b, -c$ on the
diagonal above. Find $A^{-1}$ for this bidiagonal matrix.


20 Find the numbers a and b that give the inverse of 5 * eye(4) − ones(4,4):

$$\begin{bmatrix} 4 & -1 & -1 & -1 \\ -1 & 4 & -1 & -1 \\ -1 & -1 & 4 & -1 \\ -1 & -1 & -1 & 4 \end{bmatrix}^{-1} = \begin{bmatrix} a & b & b & b \\ b & a & b & b \\ b & b & a & b \\ b & b & b & a \end{bmatrix}.$$

What are a and b in the inverse of 6 * eye(5) − ones(5,5)? In MATLAB, I = eye.

21 Sixteen 2 by 2 matrices contain only 1’s and 0’s. How many of them are invertible?


Questions 22-28 are about the Gauss-Jordan method for calculating $A^{-1}$.


22 Change I into $A^{-1}$ as you reduce A to I (by row operations):


apa n=) sake : 


379) 80.1 


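23 Follow the 3 by 3 text example of Gauss-Jordan but with all plus signs in A.
Eliminate above and below the pivots to reduce $[A\ \ I]$ to $[I\ \ A^{-1}]$:

$$[A\ \ I] = \left[\begin{array}{rrr|rrr} 2 & 1 & 0 & 1 & 0 & 0 \\ 1 & 2 & 1 & 0 & 1 & 0 \\ 0 & 1 & 2 & 0 & 0 & 1 \end{array}\right].$$

24 Use Gauss-Jordan elimination on $[U\ \ I]$ to find the upper triangular $U^{-1}$:

$$UU^{-1} = I \qquad \begin{bmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

25 Find $A^{-1}$ and $B^{-1}$ (if they exist) by elimination on $[A\ \ I]$ and $[B\ \ I]$:

$$A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{bmatrix}.$$

26 What three matrices $E_{21}$ and $E_{12}$ and $D^{-1}$ reduce A = E 1 to the identity
matrix? Multiply $D^{-1}E_{12}E_{21}$ to find $A^{-1}$.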
27 Invert these matrices A by the Gauss-Jordan method starting with $[A\ \ I]$:

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} \quad\text{and}\quad A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 2 \\ 1 & 2 & 3 \end{bmatrix}.$$

28 Exchange rows and continue with Gauss-Jordan to find $A^{-1}$:
Oe 30
parsers
22 72% Osa

29 True or false (with a counterexample if false and a reason if true):

(a) A 4 by 4 matrix with a row of zeros is not invertible.
(b) Every matrix with 1’s down the main diagonal is invertible.
(c) If A is invertible then $A^{-1}$ and $A^2$ are invertible.

30 For which three numbers c is this matrix not invertible, and why not?

$$A = \begin{bmatrix} 2 & c & c \\ c & c & c \\ 8 & 7 & c \end{bmatrix}.$$

31 Prove that A is invertible if $a \neq 0$ and $a \neq b$ (find the pivots or $A^{-1}$):

$$A = \begin{bmatrix} a & b & b \\ a & a & b \\ a & a & a \end{bmatrix}.$$

32 This matrix has a remarkable inverse. Find $A^{-1}$ by elimination on $[A\ \ I]$. Extend to
a 5 by 5 “alternating matrix” and guess its inverse; then multiply to confirm.

$$\text{Invert}\quad A = \begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & -1 & 1 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad\text{and solve}\quad Av = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}.$$

33 (Puzzle) Could a 4 by 4 matrix A be invertible if every row contains the numbers
0, 1, 2, 3 in some order? What if every row of B contains 0, 1, 2, −3 in some order?

34 Find and check the inverses (assuming they exist) of these block matrices:

$$\begin{bmatrix} I & 0 \\ C & I \end{bmatrix} \qquad \begin{bmatrix} A & 0 \\ C & D \end{bmatrix} \qquad \begin{bmatrix} 0 & I \\ I & D \end{bmatrix}$$
4.5 Symmetric Matrices and Orthogonal Matrices


This section introduces the transpose of a matrix. Start with any m by n matrix A. Then
the rows of A become the columns of $A^T$ (called “A transpose”). The columns of A are the
rows of $A^T$. The m by n matrix is flipped across its main diagonal. Then $A^T$ is n by m.

$$\textbf{Transpose}\qquad \text{If}\quad A = \begin{bmatrix} 1 & 2 & 6 \\ 0 & 0 & 5 \end{bmatrix} \quad\text{then}\quad A^T = \begin{bmatrix} 1 & 0 \\ 2 & 0 \\ 6 & 5 \end{bmatrix}.$$

The entry in row i, column j of $A^T$ comes from row j, column i of A. So $(A^T)_{ij} = A_{ji}$.


The transpose of a lower triangular matrix is upper triangular. Two key rules:


Products AB     The transpose of AB is $(AB)^T = B^TA^T$.     (1)


Inverses $A^{-1}$     The transpose of $A^{-1}$ is $(A^{-1})^T = (A^T)^{-1}$.     (2)
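Both rules can be tested on small matrices. The sketch below is illustrative only: the matrices A, B and M are choices made here, and the helpers `transpose`, `matmul` and `inv2` are names introduced for this example. Rule (1) is checked on rectangular matrices, rule (2) on an invertible 2 by 2 matrix.

```python
from fractions import Fraction

def transpose(M):
    """Rows become columns."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Rows of X times columns of Y."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inv2(M):
    """Inverse of an invertible 2 by 2 matrix by the ad - bc formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 2, 6], [0, 0, 5]]        # 2 by 3
B = [[1, 0], [2, 1], [0, 3]]      # 3 by 2, illustrative choice
M = [[2, 3], [4, 7]]              # invertible 2 by 2, illustrative choice

rule1_left = transpose(matmul(A, B))                  # (AB)^T
rule1_right = matmul(transpose(B), transpose(A))      # B^T A^T
rule2_left = transpose(inv2(M))                       # (M^-1)^T
rule2_right = inv2(transpose(M))                      # (M^T)^-1

print(rule1_left == rule1_right, rule2_left == rule2_right)
```

Note that $A^TB^T$ would be a 3 by 3 matrix here, so only the reverse order $B^TA^T$ even has the right shape.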


Notice especially how $B^TA^T$ comes in reverse order. For inverses, this reverse order is quick
to check: $B^{-1}A^{-1}$ times AB produces $B^{-1}(A^{-1}A)B = I$. For transposes, rules (1) and
(2) are tested and explained in the problem set. We want to move to the essential matrices of
this section because they are the most important matrices in mathematics:


Symmetric matrices     $A^T$ equals A. Then A is square and $a_{ij} = a_{ji}$.


Orthogonal matrices     $A^T$ equals $A^{-1}$. Then A is square and $A^TA = I$.

Here is a symmetric example S and also an orthogonal example Q:


4 6 


eeeeisihls ind _ | cos@ —sin@ 
Symmetric S = | Orthogonal Q = | ag seas eo | 


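Symmetry of S is easy to see: 4 = 4. For orthogonality I will check that $Q^TQ = I$:

$$\begin{matrix} \text{Columns are orthogonal} \\ \text{Columns are unit vectors} \end{matrix} \qquad \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \qquad (3)$$

Those words at the left tell you the key facts about the columns $q_1$ and $q_2$:

$$Q^TQ = \begin{bmatrix} q_1^T \\ q_2^T \end{bmatrix} \begin{bmatrix} q_1 & q_2 \end{bmatrix} = \begin{bmatrix} q_1^Tq_1 & q_1^Tq_2 \\ q_2^Tq_1 & q_2^Tq_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \qquad (4)$$

Off the diagonal you see $q_1^Tq_2 = 0$ and $q_2^Tq_1 = 0$. The columns are orthogonal vectors.
On the diagonal $q_1^Tq_1 = 1$ and $q_2^Tq_2 = 1$. The q’s are unit column vectors: length 1.
Symmetric matrices will have the special letter S and orthogonal matrices will be Q.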
Symmetric Matrices $S = A^TA$


The full glory of symmetric matrices comes with their eigenvalues $\lambda$ and eigenvectors x.
Those strange words, half German and half English, are at the heart of Chapter 6. You
will see the key equation $Ax = \lambda x$ (this puts Ax in the same direction as x). Let me
write here only two facts that show why symmetric matrices are special:


$Sx = \lambda x$     Symmetric matrices have real eigenvalues $\lambda$ and orthogonal eigenvectors x.


Those facts will be crucial in solving symmetric systems $y' = Sy$ and $y'' + Sy = 0$.

It is equally important to know where symmetric matrices come from. One part of applied 
mathematics and engineering mathematics is solving equations. We have solved Av = b 
and we will soon solve dy/dt = Ay. Solving is one half of our subject, the other half is 
discovering the equations in the first place. 

Start with a physical or biological or economic problem. Model it by equations.
Solving $F = ma$ and $e = mc^2$ may take thought, but we give first place to Newton and
Einstein for discovering those equations.

To repeat: Where do symmetric matrices come from? In my experience, you start with
a matrix A. Often this matrix is rectangular (m by n). Its transpose is also rectangular
($A^T$ is n by m). Sooner or later, you are almost sure to see the matrix $A^TA$. At that moment
you have a square symmetric n by n matrix:


$S = A^TA$ is always symmetric. Its transpose is $S^T = (A^TA)^T = A^TA^{TT} = S$.     (5)


This matrix $A^TA$ is automatically square, because (n by m) times (m by n) is (n by n).


Example 1     $A^TA = \begin{bmatrix} 11 & 12 \\ 12 & 16 \end{bmatrix}$ for a 3 by 2 matrix A.


The number 12 comes twice in $A^TA$. It is (row 1 of $A^T$) · (column 2 of A) and also
(row 2 of $A^T$) · (column 1 of A). The numbers 11 and 16 on the diagonal are dot products
of a column with itself. So they give the length squared of the columns. These diagonal
entries of $A^TA$ cannot be negative.


Comment. Since A is 3 by 2, the system $Av = b$ has three equations but only
two unknowns $v_1$ and $v_2$. Almost surely there will be no solution. But if those numbers
$b_1, b_2, b_3$ came from careful and expensive measurements, we cannot say “no solution”
and stop. We want to find the “best solution” or “closest solution” to $Av = b$.

In practice we usually choose the vector $\hat{v}$ that makes $A\hat{v}$ as close as possible to b.
The error vector $e = b - A\hat{v}$ is as short as possible. We are minimizing $\|e\|^2 = e^Te$,
the squared length of the error. The best vector $\hat{v}$ is the least squares solution.

In Section 7.1, minimizing the error is a calculus problem and also a linear algebra
problem. Both approaches lead to the equation $A^TA\hat{v} = A^Tb$. The best $\hat{v}$ involves $A^TA$.
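To make the equation $A^TA\hat{v} = A^Tb$ concrete, here is a minimal sketch with made-up data. The 3 by 2 matrix A and the right side b below are assumptions chosen for illustration (they are not the matrix of Example 1), and the 2 by 2 system is solved with the $ad - bc$ formula.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

# Hypothetical data: three equations, two unknowns, no exact solution.
A = [[1, 0], [1, 1], [1, 2]]
b = [[6], [0], [0]]

At = transpose(A)
S = matmul(At, A)        # the symmetric matrix A^T A
rhs = matmul(At, b)      # the right side A^T b

# Solve the 2 by 2 system S v_hat = rhs with the ad - bc formula.
(p, q), (r, s) = S
det = Fraction(p * s - q * r)
v_hat = [(s * rhs[0][0] - q * rhs[1][0]) / det,
         (-r * rhs[0][0] + p * rhs[1][0]) / det]

# Error e = b - A v_hat, which should be perpendicular to both columns of A.
e = [b[i][0] - (A[i][0] * v_hat[0] + A[i][1] * v_hat[1]) for i in range(3)]
At_e = [sum(At[j][i] * e[i] for i in range(3)) for j in range(2)]

print(S, v_hat, e, At_e)
```

The last vector is zero: the error is perpendicular to the columns of A, which is another way to state $A^TA\hat{v} = A^Tb$.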
Difference Matrices


I want to show you larger examples of $A^TA$ that are truly important. Start with a backward
difference matrix A. It can have n + 1 rows and n columns. Here n = 3:

$$\begin{matrix} \text{Difference matrix} \\ \text{Differences of } v\text{’s} \end{matrix} \qquad A = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \end{bmatrix} \qquad Av = \begin{bmatrix} v_1 \\ v_2 - v_1 \\ v_3 - v_2 \\ -v_3 \end{bmatrix} \qquad (6)$$

That vector Av in linear algebra corresponds to the derivative $dv/dx$ in calculus.
You see backward differences $\Delta v/\Delta x = [\,v(x) - v(x - \Delta x)\,]/\Delta x$ in calculus. This is
before the stepsize $\Delta x$ approaches zero and $\Delta v/\Delta x$ approaches $dv/dx$.

More often you see forward differences $[\,v(x + \Delta x) - v(x)\,]/\Delta x$, where the small $\Delta x$
goes forward from x. Those appear in linear algebra when we transpose the matrix A.
But first differences are “anti-symmetric” and $A^T$ will be minus a forward difference.
So the vector $A^Tw$ corresponds to the derivative $-dw/dx$:

$$\begin{matrix} \text{3 by 4 matrix} \\ \text{Differences of } w\text{’s} \end{matrix} \qquad A^T = \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \end{bmatrix} \qquad A^Tw = \begin{bmatrix} w_1 - w_2 \\ w_2 - w_3 \\ w_3 - w_4 \end{bmatrix} \qquad (7)$$


Now comes the symmetric matrix $S = A^TA$. It will be 3 by 3. Since A and $A^T$ are
“first differences” with 1 and −1, $A^TA$ will be a second difference matrix with −1, 2, −1:

$$\textbf{Second differences}\qquad S = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix} \qquad Sv = \begin{bmatrix} 2v_1 - v_2 \\ -v_1 + 2v_2 - v_3 \\ -v_2 + 2v_3 \end{bmatrix} \qquad (8)$$

The main diagonal of S has 2’s, because each column of A produces $1^2 + (-1)^2 = 2$.


The subdiagonal and superdiagonal of S have −1’s, because this is the dot product of a
column of A with the next column.
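The step from first differences to second differences can be reproduced for any n. In this sketch the function name `backward_difference` is introduced here for illustration. It builds the (n + 1) by n matrix with 1 on the main diagonal and −1 just below it, then forms $A^TA$.

```python
def backward_difference(n):
    """The (n+1) by n matrix with 1 on the main diagonal and -1 just below it."""
    return [[1 if i == j else -1 if i == j + 1 else 0 for j in range(n)]
            for i in range(n + 1)]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = backward_difference(3)
S = matmul(transpose(A), A)    # second difference matrix with -1, 2, -1

for row in S:
    print(row)
```

Changing the argument 3 to a larger n gives the n by n version of S with the same −1, 2, −1 pattern.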


Let me admit quietly that S is my favorite matrix. You are seeing the 3 by 3 version;
what I really like is n by n. Chapter 7 makes the link with calculus, where the first derivative
of the first derivative is the second derivative:

$$Sv \text{ corresponds to } -\frac{d^2v}{dx^2} \qquad \frac{d^2v}{dx^2} \approx \frac{v(x + \Delta x) - 2v(x) + v(x - \Delta x)}{(\Delta x)^2} \qquad (9)$$

All of Chapter 2 was about second order equations involving $y''$. Newton’s Law $F = ma$
puts second derivatives (the acceleration a) at the heart of physics. When springs
oscillate, and when current goes through a network, this matrix $S = A^TA$ will appear.


The truth is that we need to know everything about S: its pivots, its determinant,
its inverse, its eigenvalues, its eigenvectors. We will.
The matrix $L = AA^T$ is almost as important. Please recognize that L is also symmetric,
but L is different from S. When A has n columns and n + 1 rows, $S = A^TA$ is n by n.
But $L = AA^T$ is square of size n + 1. We keep n = 3 and n + 1 = 4:

$$\begin{matrix} \text{Second differences in } L \\ \text{New boundary conditions} \end{matrix} \qquad L = AA^T = \begin{bmatrix} 1 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 1 \end{bmatrix} \qquad (10)$$

This matrix has no inverse! Can you see a vector w that has $Lw = 0$? It is the vector
of all ones, $w = (1, 1, 1, 1)$. Each row of L adds to zero and that will produce $Lw = 0$.
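A short computation confirms that L is singular. The sketch below rebuilds the 4 by 3 backward difference matrix (the helper name `backward_difference` is introduced here), forms $L = AA^T$, and multiplies by the vector of all ones.

```python
def backward_difference(n):
    """The (n+1) by n matrix with 1 on the main diagonal and -1 just below it."""
    return [[1 if i == j else -1 if i == j + 1 else 0 for j in range(n)]
            for i in range(n + 1)]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = backward_difference(3)
L = matmul(A, transpose(A))            # 4 by 4, symmetric
w = [1, 1, 1, 1]
Lw = [sum(entry * x for entry, x in zip(row, w)) for row in L]

for row in L:
    print(row)
print(Lw)    # every row of L adds to zero
```

Since a nonzero w gives $Lw = 0$, L cannot have an inverse.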


Permutation Matrices 


A quick way to produce orthogonal matrices is to use the columns of the identity matrix.
In any order, the columns of I are orthonormal. The new order is called a “permutation”
of the original order. So the new matrix is called a permutation matrix.

Important: We could put the rows of I into the new order. That also produces a permutation
matrix. If this row exchange matrix is P, then the column exchange matrix is $P^T$.
You can see the transpose in this 3 by 3 example starting from I:

$$P = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \qquad P^T = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \qquad (11)$$

When P multiplies a vector v, it puts the components of v in the new order y, z, x.
Then $P^T$ puts them back in the original order x, y, z:

$$P \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} y \\ z \\ x \end{bmatrix} \quad\text{and}\quad P^T \begin{bmatrix} y \\ z \\ x \end{bmatrix} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}.$$

These are orthogonal matrices, so $P^{-1}$ is the same as $P^T$. Then $P^TP = PP^T = I$.


We can complete the list of all 3 by 3 permutation matrices (including the identity matrix
itself, which exchanges nothing: the identity permutation). The other permutations
exchange two rows or two columns of I. There are P and $P^T$ in (11), and four more.


Altogether 6 permutation matrices when n = 3. And n! permutation matrices of size n.
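The count n! and the rule $P^{-1} = P^T$ can both be checked by listing every permutation matrix. This is a sketch using Python’s `itertools.permutations`. The function name `permutation_matrices` is introduced here for illustration.

```python
from itertools import permutations
from math import factorial

def permutation_matrices(n):
    """All n by n permutation matrices: the rows of I taken in every order."""
    return [[[1 if col == order[row] else 0 for col in range(n)] for row in range(n)]
            for order in permutations(range(n))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
mats = permutation_matrices(3)

print(len(mats), factorial(3))                                   # both are 6
print(all(matmul(transpose(P), P) == identity for P in mats))    # P^T P = I every time
```

The list contains the identity, the three single row exchanges, and the two cyclic matrices that appear as P and $P^T$ in the text.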
The effect of $P_{12}$ is to exchange (permute) rows 1 and 2, when we multiply $P_{12}A$ or $P_{12}b$.

$$P_{12} \begin{bmatrix} \text{row 1 of } A \\ \text{row 2 of } A \\ \text{row 3 of } A \end{bmatrix} = \begin{bmatrix} \text{row 2 of } A \\ \text{row 1 of } A \\ \text{row 3 of } A \end{bmatrix} \qquad P_{12} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} b_2 \\ b_1 \\ b_3 \end{bmatrix}$$

This is exactly what we do in elimination, when a zero appears in the first pivot position.
If $a_{11} = 0$ and $a_{21} \neq 0$, $P_{12}$ exchanges rows to produce a nonzero pivot.


Elimination by matrices     Eliminate by $E_{ij}$, exchange rows by $P_{jk}$.


The elimination matrix $E_{ij}$ subtracts a multiple $\ell_{ij}$ of row j from a lower row i > j.
Before that, a permutation matrix $P_{jk}$ may put row k into row j, to produce a better number
(a larger number) in the pivot position.

We must use $P_{jk}$ to get a nonzero pivot. We may use $P_{jk}$ to get a larger pivot. The
LAPACK code (open source) chooses the largest available number as the pivot. The
jth pivot (in column j) will be the largest number, in absolute value, in row j or below.
LAPACK is the foundation for the linear algebra part of many important software systems,
including MATLAB.
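The pivot choice described above (often called partial pivoting) can be sketched as follows. This is not LAPACK’s code. It is a small illustration with exact fractions, and the function name `pivots_with_partial_pivoting` and the test matrices are assumptions made here.

```python
from fractions import Fraction

def pivots_with_partial_pivoting(A):
    """Eliminate below each pivot, exchanging rows to bring up the largest entry.

    Returns (pivots, exchanges). Fewer than n pivots means A is singular.
    """
    n = len(A)
    M = [[Fraction(x) for x in row] for row in A]
    pivots, exchanges = [], []
    for j in range(n):
        # Largest entry (in absolute value) in column j, at row j or below.
        k = max(range(j, n), key=lambda r: abs(M[r][j]))
        if M[k][j] == 0:
            break                          # no pivot in this column: singular
        if k != j:
            M[j], M[k] = M[k], M[j]        # the row exchange P_jk
            exchanges.append((j, k))
        pivots.append(M[j][j])
        for i in range(j + 1, n):
            multiplier = M[i][j] / M[j][j]   # the multiplier l_ij
            M[i] = [x - multiplier * y for x, y in zip(M[i], M[j])]
    return pivots, exchanges

print(pivots_with_partial_pivoting([[0, 2], [3, 4]]))    # needs a row exchange
print(pivots_with_partial_pivoting([[1, 2], [1, 2]]))    # only one pivot
```

The row indices in `exchanges` count from 0, so `(0, 1)` means rows 1 and 2 of the text were exchanged.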


Orthogonal Matrices 


When A has orthogonal columns, the symmetric matrix A™ A is diagonal. The off-diagonal 
entries are dot products of different columns of A, so they are all zero. 


When the columns of A are unit vectors (length 1), all diagonal entries of ATA are 1. 
Those entries are (row i of A‘) - (column i of A) = length squared = 1. Dot products 
of columns with themselves are on the main diagonal of ATA. 


The best case is orthonormal columns. Those are orthogonal unit vectors, both 
properties at the same time. In this case we write q for the vectors and Q for the matrix: 


‘4 10 0 
Orthogonal gq/q;=0 . qi 
Walk soc tT, 17 QTQ=] + | |a---gn]=]0 1 Of. (12) 
nit vectors q;q; = a cm 4 


When Q is square, I call it an orthogonal matrix. (The name “orthonormal matrix” 
might have been better.) I still use the letter Q when the matrix is rectangular, with 
m > n. But a rectangular Q” is only a left-inverse of Q: 


(m=n) Q7Q=QQ?=I-  (m>n) Q*Q=I1 but QQTFI. (13) 


$Q^TQ = I$ is a very powerful property. When we multiply any vector by Q, its length
will not change: 


Same length ||Qv|| = ||v|| for every vector v. (14) 


The proof comes directly from $\|Qv\|^2 = (Qv)^T(Qv) = v^TQ^TQv$. The matrix $Q^TQ$ is
the identity. So we are left with $v^Tv = \|v\|^2$.
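A quick numerical check of equation (14) is easy to write. The sketch below uses a 2 by 2 rotation as one example of an orthogonal Q. The angle, the vector, and the helper names `matvec` and `norm` are assumptions chosen for illustration.

```python
import math


def matvec(M, v):
    """Multiply the matrix M (list of rows) by the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def norm(v):
    """Length of v: the square root of v . v."""
    return math.sqrt(sum(x * x for x in v))


theta = 0.7                                   # any angle gives an orthogonal Q
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]
v = [3.0, 4.0]                                # a vector of length 5
Qv = matvec(Q, v)

length_before = norm(v)
length_after = norm(Qv)                       # agrees with length_before up to roundoff
```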
The fact that lengths don’t change makes orthogonal matrices very safe to compute with.
Nothing blows up, nothing becomes too small (no overflow and no underflow). 
The basic computation in linear algebra is the solution of a linear system, and for 
(square) orthogonal matrices this is incredibly easy : 


$Q^{-1} = Q^T$        The solution of $Qv = b$ is $v = Q^Tb$. (15)


To solve the equations, we just transpose the matrix. The greatest example is the Fourier 
matrix, which breaks up a signal b into separate pure frequencies. The vector b in the time 
domain is transformed to v in the frequency domain. The “energy” can be measured in either 
domain, because $\|b\|^2$ is equal to $\|v\|^2$, as we saw above.

The Fourier matrix F is exceptional because multiplications by F and $F^{-1}$ are
extremely fast. They break up into diagonal matrices and permutation matrices. This is 
the insight behind the Fast Fourier Transform. (The FFT is in Section 8.2.) 

The equation Qv = b has a clear geometrical meaning when Q is 2 by 2. Qv = b is
expressing that vector b as a combination of the columns of Q. Those columns $q_1$, $q_2$ give the
perpendicular axes in Figure 4.9. We are finding the component of b in each direction.

Those two components are $v_1 = q_1 \cdot b$ and $v_2 = q_2 \cdot b$. Solving $Qv = b$ by $v = Q^Tb$
is just a change from x, y axes to $q_1$, $q_2$ axes.


Figure 4.9: Every b = (x, y) splits into $b = v_1q_1 + v_2q_2$. And $\|b\|^2 = x^2 + y^2 = v_1^2 + v_2^2$.
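The change of axes can be tried on numbers. In this sketch the orthonormal columns $q_1$, $q_2$ come from a rotation by 30 degrees and b = (2, 1). Both choices are assumptions for illustration. The components are two dot products, and no elimination is needed.

```python
import math


def dot(x, y):
    """Dot product of two vectors of equal length."""
    return sum(a * c for a, c in zip(x, y))


theta = math.pi / 6
q1 = [math.cos(theta), math.sin(theta)]      # first column of Q
q2 = [-math.sin(theta), math.cos(theta)]     # second column, perpendicular to q1
b = [2.0, 1.0]

# v = Q^T b: each component is a dot product with one column of Q.
v1 = dot(q1, b)
v2 = dot(q2, b)

# Rebuild b from its components along the q axes: b = v1 q1 + v2 q2.
rebuilt = [v1 * q1[i] + v2 * q2[i] for i in range(2)]

# The "energy" is the same in both sets of axes.
energy_xy = dot(b, b)
energy_q = v1 * v1 + v2 * v2
```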


Both Symmetric and Orthogonal 


Symmetric matrices are the best; they are everywhere in applied mathematics. Orthogonal
matrices are a strong second, starting with rotation matrices and the Fourier matrix. Most 
symmetric matrices are not orthogonal and most orthogonal matrices are not symmetric. 
It is natural to wonder when and if we can have both properties at once. 

Exchange and reflection and “Hadamard” matrices are symmetric and orthogonal: 


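$$P = \begin{bmatrix} 0 & 1\\ 1 & 0 \end{bmatrix} \qquad R = \begin{bmatrix} -\cos\theta & \sin\theta\\ \sin\theta & \cos\theta \end{bmatrix} \qquad H = \frac{1}{2}\begin{bmatrix} -1 & 1 & 1 & 1\\ 1 & -1 & 1 & 1\\ 1 & 1 & -1 & 1\\ 1 & 1 & 1 & -1 \end{bmatrix} \qquad (16)$$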


Notice that the columns of H are unit vectors: $\tfrac{1}{4}\left((-1)^2 + 1^2 + 1^2 + 1^2\right) = 1$. Nobody
knows which dimensions allow n orthogonal vectors of 1’s and −1’s (not odd dimensions!).
Wikipedia describes this unsolved problem on its “Hadamard matrix” page. 
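Both properties of H can be checked directly. The sketch below takes H to be the 4 by 4 matrix with −1/2 on the diagonal and 1/2 elsewhere, which matches the unit-length computation in the text. The helper names are assumptions for this example. All entries are multiples of 1/4, so the floating point comparisons here are exact.

```python
def transpose(M):
    """Rows of M become columns."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Matrix product AB for matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


H = [[-0.5, 0.5, 0.5, 0.5],
     [0.5, -0.5, 0.5, 0.5],
     [0.5, 0.5, -0.5, 0.5],
     [0.5, 0.5, 0.5, -0.5]]
I4 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

is_symmetric = (H == transpose(H))                 # H = H^T
is_orthogonal = (matmul(transpose(H), H) == I4)    # H^T H = I
```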
To find more symmetric orthogonal matrices, and eventually all of them, we can use
an important fact about orthogonal matrices : 


If $Q_1$ and $Q_2$ are orthogonal, so is their product $Q = Q_1Q_2$.


The test is always to check $Q^TQ = I$. Here this is $(Q_1Q_2)^T(Q_1Q_2) = Q_2^TQ_1^TQ_1Q_2$.
In the middle is $Q_1^TQ_1 = I$. Then the outside has $Q_2^TQ_2 = I$.


Conclusion: We can multiply orthogonal matrices and stay orthogonal. 


Problem: We can’t always multiply symmetric matrices and stay symmetric. 


Here is one approach that succeeds with both properties. Start with any diagonal matrix 
D of 1’s followed by —1’s: 


Symmetric and orthogonal    $D = \mathrm{diag}(1, \ldots, 1, -1, \ldots, -1)$. (17)


Multiply D on the left side by any orthogonal Q and on the right side by $Q^T$. That
“symmetric multiplication” keeps the matrix $QDQ^T$ symmetric:


Symmetric and orthogonal    $(QDQ^T)^T = Q^{TT}D^TQ^T = QDQ^T$. (18)


This product of orthogonal matrices is also orthogonal. When you meet eigenvalues in 
Chapter 6, you will see that all symmetric orthogonal matrices have this form $QDQ^T$.
Possibly that small fact is appearing for the first time in a textbook. 
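The construction in (17) and (18) can be tested on a small case. This sketch takes Q to be a 2 by 2 rotation and D = diag(1, −1). The angle and the helper names are assumptions for illustration. The product should be symmetric and orthogonal, and for a rotation by θ it works out to the reflection with entries cos 2θ, sin 2θ, sin 2θ, −cos 2θ.

```python
import math


def transpose(M):
    """Rows of M become columns."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Matrix product AB for matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


theta = 0.4                                  # arbitrary angle
c, s = math.cos(theta), math.sin(theta)
Q = [[c, -s], [s, c]]                        # one choice of orthogonal Q
D = [[1.0, 0.0], [0.0, -1.0]]                # 1's followed by -1's

S = matmul(matmul(Q, D), transpose(Q))       # S = Q D Q^T
StS = matmul(transpose(S), S)                # close to the identity if S is orthogonal
```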


Factoring a Matrix 


That was for fun. This is more important. “A symmetric matrix S is like a real number r.”
“An orthogonal matrix Q is like a complex number $e^{i\theta}$ with absolute value 1.” Every
complex number can be written in polar form $re^{i\theta}$, and what we hope for is true:


Every square matrix A can be written in polar form A = SQ.


A = SQ is equivalent to the Singular Value Decomposition (this is explained in
Section 7.2). The SVD is the last and most remarkable step in the Fundamental Theorem
of Linear Algebra. The polar form is in the Chapter 7 Notes.


■ REVIEW OF THE KEY IDEAS ■


1. The transpose has $(A^T)_{ij} = A_{ji}$. Then $(AB)^T = B^TA^T$ and $Av \cdot w$ equals $v \cdot A^Tw$.

2. Symmetric matrices have $S^T = S$. Orthogonal matrices have $Q^T = Q^{-1}$.

3. $A^TA$ is always a symmetric matrix. Key examples are second difference matrices.

4. The columns of Q are orthogonal vectors of length 1. Then $\|Qv\| = \|v\|$ for all v.

5. The n! permutation matrices P reorder the rows of I (n by n), and $P^T = P^{-1}$.
Problem Set 4.5


Questions 1-9 are about transposes $A^T$ and symmetric matrices $S = S^T$.


1 Find $A^T$ and $A^{-1}$ and $(A^{-1})^T$ and $(A^T)^{-1}$ for


$$A = \begin{bmatrix} 1 & 0\\ 4 & 3 \end{bmatrix} \quad\text{and also}\quad A = \begin{bmatrix} 1 & c\\ c & 0 \end{bmatrix}$$


2 (a) Find 2 by 2 symmetric matrices A and B so that AB is not symmetric. 


(b) With $A^T = A$ and $B^T = B$, show that AB = BA ensures that AB will now
be symmetric. The product is symmetric only when A commutes with B. 


3 (a) The matrix $((AB)^{-1})^T$ comes from $(A^{-1})^T$ and $(B^{-1})^T$. In what order?
(b) If U is upper triangular then $(U^{-1})^T$ is ____ triangular.


4 Show that $A^2 = 0$ is possible but $A^TA = 0$ is not possible (unless A = zero matrix).


5 Every square matrix A has a symmetric part and an antisymmetric part : 
$$A = \text{symmetric} + \text{antisymmetric} = \frac{A + A^T}{2} + \frac{A - A^T}{2}.$$


Transpose the antisymmetric part to get minus that part. Split these in two parts: 


1 4 8 
A= E 4 A=|0 2 6 
0 0 3 
6 The transpose of a block matrix $M = \begin{bmatrix} A & B\\ C & D \end{bmatrix}$ is $M^T$ = ____. Test an example
to be sure. Under what conditions on A, B, C, D is the block matrix symmetric?
7 True or false: 


(a) The block matrix $\begin{bmatrix} 0 & A\\ A & 0 \end{bmatrix}$ is automatically symmetric.
(b) If A and B are symmetric then their product AB is symmetric. 
(c) If A is not symmetric then A~! is not symmetric. 


(d) When A, B, C are symmetric, the transpose of ABC is CBA.


8 (a) How many entries of S can be chosen independently, if $S = S^T$ is 5 by 5?


(b) How many entries can be chosen if A is skew-symmetric? ($A^T = -A$).


9 Transpose the equation $A^{-1}A = I$. The result shows that the inverse of $A^T$ is ____.
If S is symmetric, how does this show that $S^{-1}$ is also symmetric?
Questions 10-14 are about permutation matrices.


10 Why are there n! permutation matrices of size n? They give n! orders of 1,...,n.


11 If $P_1$ and $P_2$ are permutation matrices, so is $P_1P_2$. This still has the rows of I in some
order. Give examples with $P_1P_2 \neq P_2P_1$ and $P_3P_4 = P_4P_3$.


12 There are 12 “even” permutations of (1, 2, 3, 4), with an even number of exchanges.
Two of them are (1, 2, 3, 4) with no exchanges and (4, 3, 2, 1) with two exchanges.
List the other ten. Instead of writing each 4 by 4 matrix, just order the numbers.


13 If P has 1’s on the antidiagonal from (1, n) to (n, 1), describe PAP. Is P even?


14 (a) Find a 3 by 3 permutation matrix with $P^3 = I$ (but not P = I).
(b) Find a 4 by 4 permutation with $P^4 \neq I$.

Questions 15-18 are about first differences A and second differences $A^TA$ and $AA^T$.


15 Write down the 5 by 4 backward difference matrix A.

(a) Compute the symmetric second difference matrices $S = A^TA$ and $L = AA^T$.
(b) Show that S is invertible by finding $S^{-1}$. Show that L is singular.


16 In Problem 15, find the pivots of S and L (4 by 4 and 5 by 5). The pivots of S in
equation (8) are 2, 3/2, 4/3. The pivots of L in equation (10) are 1, 1, 1, 0 (fail).


17 (Computer problem) Create the 9 by 10 backward difference matrix A. Multiply to
find $S = A^TA$ and $L = AA^T$. If you have linear algebra software, ask for the
determinants det(S) and det(L).

Challenge: By experiment find det(S) when $S = A^TA$ is n by n.


18 (Infinite computer problem) Imagine that the second difference matrix S is infinitely
large. The diagonals of 2’s and −1’s go from minus infinity to plus infinity:


Infinite tridiagonal matrix
$$S = \begin{bmatrix} \ddots & \ddots & & & \\ -1 & 2 & -1 & & \\ & -1 & 2 & -1 & \\ & & -1 & 2 & -1\\ & & & \ddots & \ddots \end{bmatrix}$$

(a) Multiply S times the infinite all-ones vector v = (..., 1, 1, 1, 1, ...)
(b) Multiply S times the infinite linear vector w = (..., 0, 1, 2, 3, ...)
(c) Multiply S times the infinite squares vector u = (..., 0, 1, 4, 9, ...).
(d) Multiply S times the infinite cubes vector c = (..., 0, 1, 8, 27, ...).


The answers correspond to second derivatives (with minus sign) of 1 and x and $x^2$ and $x^3$.
Questions 19-28 are about matrices with $Q^TQ = I$. If Q is square, then it is an
orthogonal matrix and $Q^T = Q^{-1}$ and $QQ^T = I$.


19 Complete these matrices to be orthogonal matrices : 
= i sa 
an | alf2 1) x a2 | 2 1 
Z 
lL =1 
20 (a) Suppose Q is an orthogonal matrix. Why is $Q^{-1} = Q^T$ also an orthogonal
matrix?
(b) From $Q^TQ = I$, the columns of Q are orthogonal unit vectors (orthonormal
vectors). Why are the rows of Q (square matrix) also orthonormal vectors?

21 (a) Which vectors can be the first column of an orthogonal matrix?

(b) If $Q_1^TQ_1 = I$ and $Q_2^TQ_2 = I$, is it true that $(Q_1Q_2)^T(Q_1Q_2) = I$? Assume
that the matrix shapes allow the multiplication $Q_1Q_2$.

22 If u is a unit column vector (length 1, $u^Tu = 1$), show why $H = I - 2uu^T$ is

(a) a symmetric matrix: $H = H^T$ (b) an orthogonal matrix: $H^TH = I$.

23 If $u = (\cos\theta, \sin\theta)$, what are the four entries in $H = I - 2uu^T$? Show that
$Hu = -u$ and $Hv = v$ for $v = (-\sin\theta, \cos\theta)$. This H is a reflection matrix:
the v-line is a mirror and the u-line is reflected across that mirror.

24 Suppose the matrix Q is orthogonal and also upper triangular. What can Q look like?
Must it be diagonal?

25 (a) To construct a 3 by 3 orthogonal matrix Q whose first column is in the direction
w, what first column $q_1 = cw$ would you choose?

(b) The next column $q_2$ can be any unit vector perpendicular to $q_1$. To find $q_3$,
choose a solution $v = (v_1, v_2, v_3)$ to the two equations $q_1^Tv = 0$ and $q_2^Tv = 0$.
Why is there always a nonzero solution v?

26 Why is every solution v to Av = 0 orthogonal to every row of A?

27 Suppose $Q^TQ = I$ but Q is not square. The matrix $P = QQ^T$ is not I. But show that
P is symmetric and $P^2 = P$. This is a projection matrix.

28 A 5 by 4 matrix Q can have $Q^TQ = I$ but it cannot possibly have $QQ^T = I$.
Explain in words why the four equations $Q^Tv = 0$ must have a nonzero solution v.
Then v is not the same as $QQ^Tv$ and I is not the same as $QQ^T$.
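29 Can you find a rotation matrix Q so that $QDQ^T$ is a permutation?
$$\begin{bmatrix} \cos\theta & -\sin\theta\\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} 1 & 0\\ 0 & -1 \end{bmatrix}\begin{bmatrix} \cos\theta & \sin\theta\\ -\sin\theta & \cos\theta \end{bmatrix} \quad\text{equals}\quad \begin{bmatrix} 0 & 1\\ 1 & 0 \end{bmatrix}$$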
30 Split an orthogonal matrix ($Q^TQ = QQ^T = I$) into two rectangular submatrices:


$$Q = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \quad\text{and}\quad Q^TQ = \begin{bmatrix} Q_1^TQ_1 & Q_1^TQ_2\\ Q_2^TQ_1 & Q_2^TQ_2 \end{bmatrix}$$

(a) What are those four blocks in $Q^TQ = I$?


(b) $QQ^T = Q_1Q_1^T + Q_2Q_2^T = I$ is column times row multiplication. Insert
the diagonal matrix $D = \begin{bmatrix} I & 0\\ 0 & -I \end{bmatrix}$ and do the same multiplication for $QDQ^T$.


Note: The description of all symmetric orthogonal matrices S in (18) becomes
$S = QDQ^T = Q_1Q_1^T - Q_2Q_2^T$. This is exactly the reflection matrix $I - 2Q_2Q_2^T$.


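31 The real reason that the transpose “flips A across its main diagonal” is to obey
this dot product law: $(Av) \cdot w = v \cdot (A^Tw)$. That rule $(Av)^Tw = v^T(A^Tw)$
becomes integration by parts in calculus, where A = d/dx and $A^T = -d/dx$.


(a) For 2 by 2 matrices, write out both sides (4 terms) and compare:


$$\left(\begin{bmatrix} a & b\\ c & d \end{bmatrix}\begin{bmatrix} v_1\\ v_2 \end{bmatrix}\right) \cdot \begin{bmatrix} w_1\\ w_2 \end{bmatrix} \quad\text{is equal to}\quad \begin{bmatrix} v_1\\ v_2 \end{bmatrix} \cdot \left(\begin{bmatrix} a & c\\ b & d \end{bmatrix}\begin{bmatrix} w_1\\ w_2 \end{bmatrix}\right).$$

(b) The rule $(AB)^T = B^TA^T$ comes slowly but directly from part (a):


$$(AB)v \cdot w = A(Bv) \cdot w = Bv \cdot A^Tw = v \cdot B^T(A^Tw) = v \cdot (B^TA^T)w$$


Steps 1 and 4 are the ____ law. Steps 2 and 3 are the dot product law.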


32 How is a matrix $S = S^T$ decided by its entries on and above the diagonal?
How is Q with orthonormal columns decided by its entries below the diagonal?
Together this matches the number of entries in an n by n matrix. So it is reasonable
that every matrix can be factored into A = SQ (like $re^{i\theta}$).
■ CHAPTER 4 NOTES ■


Important Question Where do the rules for matrix-matrix multiplication AB come from? 
Answer From matrix-vector multiplication Av. The matrix AB is defined so that 


AB times v equals A times Bv. Then AB times C equals A times BC.


Key idea: Choose the special vector v = (1,0,...,0). Then AB times this v is the first 
column of AB. And Bv is the first column of B. So column 1 of AB equals A times column 
1 of B. This was the AB rule from the start. Every other column of AB goes the same way, 
by moving the “1” in v. 

Thus (AB)v = A(Bv). With several v’s in a matrix C, this becomes (AB)C = A(BC).
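The key idea can be watched on a small example. The matrices A, B, C below are arbitrary choices made for illustration, and the helper names are assumptions. Multiplying AB by v = (1, 0) picks out column 1 of AB, and it agrees with A times column 1 of B.

```python
def matvec(M, v):
    """Multiply the matrix M (list of rows) by the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def matmul(A, B):
    """Matrix product AB for matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
C = [[2, 0], [1, 3]]
e1 = [1, 0]                                   # the special vector v = (1, 0)

AB = matmul(A, B)
first_column_of_AB = matvec(AB, e1)           # (AB) v
A_times_first_column_of_B = matvec(A, matvec(B, e1))   # A (B v)

left = matmul(AB, C)                          # (AB) C
right = matmul(A, matmul(B, C))               # A (BC)
```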


Elimination factors A into LU = (lower triangular) times (upper triangular). 


The MATLAB command [L, U] = lu(A) will output L and U, unless there are row
exchanges. L and U are a complete record of elimination on the left side of Av = b.
The solution v comes from the right side b by solving the two triangular systems:


Forward substitution from b to c:  Lc = b        Back substitution from c to v:  Uv = c


Then v is the correct solution: Av = LUv = Lc = b. The forward substitution is what
happened to b as elimination went forward on [A b].
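The two triangular solves are short enough to write out. This is a sketch in Python, not MATLAB's implementation. The 2 by 2 factors below are an assumed example with A = LU = [2 1; 6 8], and the function names are hypothetical.

```python
def forward_substitution(L, b):
    """Solve Lc = b for lower triangular L, working from the top row down."""
    c = []
    for i in range(len(b)):
        known = sum(L[i][j] * c[j] for j in range(i))
        c.append((b[i] - known) / L[i][i])
    return c


def back_substitution(U, c):
    """Solve Uv = c for upper triangular U, working from the bottom row up."""
    n = len(c)
    v = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(U[i][j] * v[j] for j in range(i + 1, n))
        v[i] = (c[i] - known) / U[i][i]
    return v


L = [[1.0, 0.0],
     [3.0, 1.0]]          # the multiplier 3 is recorded below the diagonal
U = [[2.0, 1.0],
     [0.0, 5.0]]          # the pivots 2 and 5 are on the diagonal
b = [5.0, 30.0]

c = forward_substitution(L, b)    # from b to c
v = back_substitution(U, c)       # from c to v
```

Here c = (5, 15) is exactly what elimination would do to the right side b, and v = (1, 3) solves the original system.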

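Second difference matrices have beautiful inverses and LU factors if the first diagonal entry
is 1 instead of 2. Here is the 3 by 3 tridiagonal matrix T and its inverse:


$$T_{11} = 1 \qquad T = \begin{bmatrix} 1 & -1 & 0\\ -1 & 2 & -1\\ 0 & -1 & 2 \end{bmatrix} \qquad T^{-1} = \begin{bmatrix} 3 & 2 & 1\\ 2 & 2 & 1\\ 1 & 1 & 1 \end{bmatrix}$$


One approach is Gauss-Jordan elimination on [T I]. That seems too mechanical.
I would rather write T using first differences L and U. The inverses are sum matrices
$U^{-1}$ and $L^{-1}$:


$$T = LU = \begin{bmatrix} 1 & 0 & 0\\ -1 & 1 & 0\\ 0 & -1 & 1 \end{bmatrix}\begin{bmatrix} 1 & -1 & 0\\ 0 & 1 & -1\\ 0 & 0 & 1 \end{bmatrix} \qquad T^{-1} = U^{-1}L^{-1} = \begin{bmatrix} 1 & 1 & 1\\ 0 & 1 & 1\\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0\\ 1 & 1 & 0\\ 1 & 1 & 1 \end{bmatrix}$$
(the factors of T are difference matrices; the factors of $T^{-1}$ are sum matrices)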
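These factorizations are easy to confirm by multiplication. The sketch below assumes the 3 by 3 case: L and U are the backward and forward first difference matrices, and their inverses are the lower and upper triangular sum matrices of all 1's. The helper name `matmul` is an assumption.

```python
def matmul(A, B):
    """Matrix product AB for matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


L = [[1, 0, 0], [-1, 1, 0], [0, -1, 1]]       # difference matrix (lower)
U = [[1, -1, 0], [0, 1, -1], [0, 0, 1]]       # difference matrix (upper)
U_inv = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]     # sum matrix (upper)
L_inv = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]     # sum matrix (lower)

T = matmul(L, U)                  # tridiagonal, with first diagonal entry 1
T_inv = matmul(U_inv, L_inv)      # inverses multiply in reverse order
check = matmul(T, T_inv)          # should be the identity
```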


Question. (4 by 4) What are the pivots of T'? What is its 4 by 4 inverse ? 
Chapter 5


Vector Spaces and Subspaces 


5.1 The Column Space of a Matrix 


To a newcomer, matrix calculations involve a lot of numbers. To you, they involve vectors. 
The columns of Av and AB are linear combinations of n vectors—the columns of A. This 
chapter moves from numbers and vectors to a third level of understanding (the highest level). 
Instead of individual columns, we look at “spaces” of vectors. Without seeing 
vector spaces and their subspaces, you haven’t understood everything about Av = b. 

Since this chapter goes a little deeper, it may seem a little harder. That is natural. We are 
looking inside the calculations, to find the mathematics. The author’s job is to make it clear. 
Section 5.5 will present the “Fundamental Theorem of Linear Algebra.” 

We begin with the most important vector spaces. They are denoted by $R^1$, $R^2$, $R^3$, $R^4$,
.... Each space $R^n$ consists of a whole collection of vectors. $R^5$ contains all column vectors
with five components. This is called “5-dimensional space.”


DEFINITION The space $R^n$ consists of all column vectors v with n components.


The components of v are real numbers, which is the reason for the letter R. When the
n components are complex numbers, v lies in the space $C^n$.


The vector space $R^2$ is represented by the usual xy plane. Each vector v in $R^2$ has two
components. The word “space” asks us to think of all those vectors, the whole plane. Each
vector gives the x and y coordinates of a point in the plane: v = (x, y).

Similarly the vectors in $R^3$ correspond to points (x, y, z) in three-dimensional space.
The one-dimensional space $R^1$ is a line (like the x axis). As before, we print vectors as a
column between brackets, or along a line using commas and parentheses:


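$$\begin{bmatrix} 4\\ \pi \end{bmatrix} \text{ is in } R^2, \qquad (1, 1, 0, 1, 1) \text{ is in } R^5, \qquad \begin{bmatrix} 1 + i\\ 1 - i \end{bmatrix} \text{ is in } C^2.$$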


The great thing about linear algebra is that it deals easily with five-dimensional space. 
We don’t draw the vectors, we just need the five numbers (or n numbers). 
To multiply v by 7, multiply every component by 7. Here 7 is a “scalar.” To add vectors
in $R^5$, add them a component at a time: five additions. The two essential vector operations
go on inside the vector space, and they produce linear combinations:


We can add any vectors in $R^n$, and we can multiply any vector v by any scalar c.


“Inside the vector space” means that the result stays in the space: This is crucial.

If v is in $R^4$ with components 1, 0, 0, 1, then 2v is the vector in $R^4$ with components
2, 0, 0, 2. (In this case 2 is the scalar.) A whole series of properties can be verified in $R^n$.
The commutative law is v + w = w + v; the distributive law is c(v + w) = cv + cw.
Every vector space has a unique “zero vector” satisfying 0 + v = v. Those are three of the
eight conditions listed in the Chapter 5 Notes.

These eight conditions are required of every vector space. There are vectors other than
column vectors, and there are vector spaces other than $R^n$. All vector spaces have to obey
the eight reasonable rules.

A real vector space is a set of “vectors” together with rules for vector addition and
multiplication by real numbers. The addition and the multiplication must produce vectors
that are in the space. And the eight conditions must be satisfied (which is usually no
problem). You need to see three vector spaces other than $R^n$:


M    The vector space of all real 2 by 2 matrices.
Y    The vector space of all solutions y(t) to $Ay'' + By' + Cy = 0$.
Z    The vector space that consists only of a zero vector.


In M the “vectors” are really matrices. In Y the vectors are functions of t, like $y = e^{st}$.
In Z the only addition is 0 + 0 = 0. In each space we can add: matrices to matrices,
functions to functions, zero vector to zero vector. We can multiply a matrix by 4 or
a function by 4 or the zero vector by 4. The result is still in M or Y or Z.

The space $R^4$ is four-dimensional, and so is the space M of 2 by 2 matrices. Vectors
in those spaces are determined by four numbers. The solution space Y is two-dimensional, 
because second order differential equations have two independent solutions. Section 5.4 will 
pin down those key words, independence of vectors and dimension of a space. 


The space Z is zero-dimensional (by any reasonable definition of dimension). It is the
smallest possible vector space. We hesitate to call it $R^0$, which means no components; you
might think there was no vector. The vector space Z contains exactly one vector.
No space can do without that zero vector. Each space has its own zero vector: the
zero matrix, the zero function, the vector (0, 0, 0) in $R^3$.


Subspaces 


At different times, we will ask you to think of matrices and functions as vectors. But at all
times, the vectors that we need most are ordinary column vectors. They are vectors with
n components, but maybe not all of the vectors with n components. There are important
vector spaces inside $R^n$. Those are subspaces of $R^n$.
Figure 5.1: “4-dimensional” matrix space M. 3 subspaces of $R^3$: plane P, line L, point Z.


Start with the usual three-dimensional space $R^3$. Choose a plane through the origin
(0, 0, 0). That plane is a vector space in its own right. If we add two vectors in the plane,
their sum is in the plane. If we multiply an in-plane vector by 2 or −5, it is still in the plane.
A plane in three-dimensional space is not $R^2$ (even if it looks like $R^2$). The vectors have
three components and they belong to $R^3$. The plane P is a vector space inside $R^3$.

This illustrates one of the most fundamental ideas in linear algebra. The plane going
through (0, 0, 0) is a subspace of the full vector space $R^3$.


DEFINITION A subspace of a vector space is a set of vectors (including 0) that satisfies
two requirements: If v and w are vectors in the subspace and c is any scalar, then


(i) v + w is in the subspace and (ii) cv is in the subspace.


In other words, the set of vectors is “closed” under addition v + w and multiplication cv 
(and dw). Those operations leave us in the subspace. We can also subtract, because —w is 
in the subspace and its sum with v is v — w. In short, all linear combinations cv + dw stay 
in the subspace. 

First fact: Every subspace contains the zero vector. The plane in $R^3$ has to go through
(0, 0, 0). We mention this separately, for extra emphasis, but it follows directly from rule (ii).
Choose c = 0, and the rule requires 0v to be in the subspace.

Planes that don’t contain the origin fail those tests. When v is on such a plane, −v and 0v
are not on the plane. A plane that misses the origin is not a subspace.

Lines through the origin are also subspaces. When we multiply by 5, or add two vectors 
on the line, we stay on the line. But the line must go through (0, 0, 0). 

Another subspace is all of $R^3$. The whole space is a subspace (of itself). That is a fourth
subspace in the figure. Here is a list of all the possible subspaces of $R^3$:


(L) Any line through (0, 0, 0)         ($R^3$) The whole space
(P) Any plane through (0, 0, 0)        (Z) The single vector (0, 0, 0)
If we try to keep only part of a plane or line, the requirements for a subspace don’t hold.
Look at these examples in $R^2$.


Example 1 Keep only the vectors (x, y) whose components are positive or zero (this is 
a quarter-plane). The vector (2,3) is included but (—2, —3) is not. So rule (ii) is violated 
when we try to multiply by c = —1. The quarter-plane is not a subspace. 


Example 2 Include also the vectors whose components are both negative. Now we have
two quarter-planes. Requirement (ii) is satisfied; we can multiply by any c. But rule (i) now
fails. The sum of v = (2, 3) and w = (−3, −2) is (−1, 1), which is outside the
quarter-planes. Two quarter-planes don’t make a subspace.
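Examples 1 and 2 can be replayed with two small membership tests. The function names are hypothetical, and the second test includes the boundary (components that are zero) in both quarter-planes, which is an assumption about how the edges are treated. One failing case is enough to show that a set is not a subspace.

```python
def in_quarter_plane(v):
    """Example 1: both components positive or zero."""
    return v[0] >= 0 and v[1] >= 0


def in_two_quarter_planes(v):
    """Example 2: both components >= 0, or both components <= 0."""
    return (v[0] >= 0 and v[1] >= 0) or (v[0] <= 0 and v[1] <= 0)


v = (2, 3)
w = (-3, -2)

# Example 1: rule (ii) fails for c = -1.
minus_v = (-v[0], -v[1])
rule_ii_fails = in_quarter_plane(v) and not in_quarter_plane(minus_v)

# Example 2: v and w are both in the set, but rule (i) fails for their sum.
v_plus_w = (v[0] + w[0], v[1] + w[1])
rule_i_fails = (in_two_quarter_planes(v) and in_two_quarter_planes(w)
                and not in_two_quarter_planes(v_plus_w))
```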

Rules (i) and (ii) involve vector addition v + w and multiplication by scalars like c 
and d. The rules can be combined into a single requirement—the rule for subspaces : 


A subspace containing v and w must contain all linear combinations cv + dw. 


Example 3 Inside the vector space M of all 2 by 2 matrices, here are two subspaces:


(U) All upper triangular matrices $\begin{bmatrix} a & b\\ 0 & d \end{bmatrix}$        (D) All diagonal matrices $\begin{bmatrix} a & 0\\ 0 & d \end{bmatrix}$.


Add any two matrices in U, and the sum is in U. Add diagonal matrices, and the sum is
diagonal. In this case D is also a subspace of U! The zero matrix alone is also a subspace,
when a, b, and d all equal zero.
For a smaller subspace of diagonal matrices, we could require a = d. The matrices are
multiples of the identity matrix I. These aI form a “line of matrices” in M and U and D.
Is the matrix I a subspace by itself? Certainly not. Only the zero matrix is. Your mind
will invent more subspaces of 2 by 2 matrices; write them down for Problem 6.


The Column Space of A 


The most important subspaces are tied directly to a matrix A. We are trying to solve 
Av = b. If A is not invertible, the system is solvable for some b and not solvable for
other b. We want to describe the good right sides b—the vectors that can be written as A 
times v. Those b’s form the “column space” of A. 

Remember that Av is a combination of the columns of A. To get every possible b, we 
use every possible v. Start with the columns of A, and take all their linear combinations. 
This produces the column space of A. It contains not just the n columns of A! 


DEFINITION The column space consists of all combinations of the columns. 


The combinations are all possible vectors Av. They fill the column space C(A).

This column space is crucial to the whole book, and here is why. To solve Av = b is to
express b as a combination of the columns. The right side b has to be in the column space
produced by A on the left side. If b is not in C(A), Av = b has no solution.


The system Av = b is solvable if and only if b is in the column space of A.


When b is in the column space, it is a combination of the columns. The coefficients in that 
combination give us a solution v to the system Av = b. 

Suppose A is an m by n matrix. Its columns have m components (not n). So the
columns belong to $R^m$. The column space of A is a subspace of $R^m$ (not $R^n$). The set
of all column combinations Av satisfies rules (i) and (ii) for a subspace: When we add
linear combinations or multiply by scalars, we still produce combinations of the columns.
The word “subspace” is always justified by taking all linear combinations.


Here is a 3 by 2 matrix A, whose column space is a subspace of $R^3$. The column space
of A is a plane in Figure 5.2.


$$A = \begin{bmatrix} 1 & 0\\ 4 & 3\\ 2 & 3 \end{bmatrix} \qquad b = v_1\begin{bmatrix} 1\\ 4\\ 2 \end{bmatrix} + v_2\begin{bmatrix} 0\\ 3\\ 3 \end{bmatrix} \qquad \text{Plane} = C(A) = \text{all vectors } Av$$


Figure 5.2: The column space C(A) is a plane containing the two columns of A.
Av = b is solvable when b is on that plane. Then b is a combination of the columns.


We drew one particular b (a combination of the columns). This b = Av lies on the plane.
The plane has zero thickness, so most right sides b in $R^3$ are not in the column space.
For most b there is no solution to our 3 equations in 2 unknowns.
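For this particular A, with columns (1, 4, 2) and (0, 3, 3), membership in the column space can be tested by hand or by a few lines of code. The sketch below is tailored to this one matrix and is not a general solver: the first two equations determine $v_1$ and $v_2$, and the third equation either agrees or it doesn't. The function name is hypothetical. Exact fractions avoid roundoff in the comparison.

```python
from fractions import Fraction


def solve_if_in_column_space(b):
    """Return (v1, v2) with v1*(1,4,2) + v2*(0,3,3) = b, or None if b is off the plane."""
    b1, b2, b3 = (Fraction(x) for x in b)
    v1 = b1                          # equation 1:  1*v1 + 0*v2 = b1
    v2 = (b2 - 4 * v1) / 3           # equation 2:  4*v1 + 3*v2 = b2
    if 2 * v1 + 3 * v2 == b3:        # equation 3:  2*v1 + 3*v2 = b3 must also hold
        return (v1, v2)
    return None


on_plane = solve_if_in_column_space((1, 7, 5))     # column 1 + column 2
off_plane = solve_if_in_column_space((1, 7, 6))    # third component changed
origin = solve_if_in_column_space((0, 0, 0))       # always solvable
```

The test in the third equation amounts to one condition on b, namely $b_3 = b_2 - 2b_1$. That single condition is the equation of the plane C(A).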

Of course (0, 0, 0) is in the column space. The plane passes through the origin. There is
certainly a solution to Av = 0. That solution, always available, is v = 0.


To repeat, the attainable right sides b are exactly the vectors in the column space. One 
possibility is the first column itself—take v; = 1 and v2 = 0. Another combination is the 
second column—take v; = 0 and v2 = 1. The new level of understanding is to see all 
combinations—the whole subspace is generated by those two columns. 


Notation The column space of A is denoted by C(A). Start with the columns and take all
their linear combinations. We might get the whole $R^m$ or only a small subspace.
Important Instead of columns in $R^m$, we could start with any set of vectors in a vector
space V. To get a subspace SS of V, we take all combinations of the vectors in that set:


S  = set of vectors s in V (S is probably not a subspace)
SS = all combinations of vectors in S (SS is a subspace)
SS = all $c_1s_1 + \cdots + c_Ns_N$ = the subspace of V “spanned” by S


When S is the set of columns, SS is the column space. When there is only one nonzero 
vector v in S, the subspace SS is the line through v. Always SS is the smallest subspace 
containing S. This is a fundamental way to create subspaces and we will come back to it. 


The subspace SS is the “span” of S, containing all combinations of vectors in S. 


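Example 4 Describe the column spaces (they are subspaces of $R^2$) for these matrices:


$$I = \begin{bmatrix} 1 & 0\\ 0 & 1 \end{bmatrix} \quad\text{and}\quad A = \begin{bmatrix} 1 & 2\\ 2 & 4 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 1 & 2 & 3\\ 0 & 0 & 4 \end{bmatrix}$$


Solution The column space of I is the whole space $R^2$. Every vector is a combination of
the columns of I. In vector space language, C(I) equals $R^2$.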

The column space of A is only a line. The second column (2, 4) is a multiple of the first 
column (1, 2). Those vectors are different, but our eye is on vector spaces. The column space 
contains (1, 2) and (2,4) and all other vectors (c, 2c) along that line. The equation Av = b 
is only solvable when b is on the line. 

For the third matrix (with three columns) the column space C(B) is all of $R^2$. Every b
is attainable. The vector b = (5, 4) is column 2 plus column 3, so v can be (0, 1, 1). The
same vector (5, 4) is also 2(column 1) + column 3, so another possible v is (2, 0, 1). This
matrix has the same column space as I: any b is allowed. But now v has extra components
and Av = b has more solutions, more combinations that give b.
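The two solutions for B are quick to verify. This sketch takes B to have columns (1, 0), (2, 0), (3, 4), which is consistent with both combinations quoted in the solution, and A to have columns (1, 2) and (2, 4). The helper names are assumptions for illustration.

```python
def matvec(M, v):
    """Multiply the matrix M (list of rows) by the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


B = [[1, 2, 3],
     [0, 0, 4]]

first = matvec(B, [0, 1, 1])     # column 2 + column 3
second = matvec(B, [2, 0, 1])    # 2(column 1) + column 3


def on_line_of_A(b):
    """C(A) for A with columns (1, 2) and (2, 4): all vectors (c, 2c)."""
    return b[1] == 2 * b[0]


b_reachable_by_A = on_line_of_A([5, 4])     # (5, 4) is not on the line
```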


The next section creates the nullspace N(A), to describe all the solutions of Av = 0. 
This section created the column space C(A), to describe all the attainable right sides b.


■ REVIEW OF THE KEY IDEAS ■


1. $R^n$ contains all column vectors with n real components.

2. M (2 by 2 matrices) and Y (functions) and Z (zero vector alone) are vector spaces.

3. A subspace containing v and w must contain all their combinations cv + dw.

4. The combinations of the columns of A form the column space C(A). Then the column
space is “spanned” by the columns.

5. Av = b has a solution exactly when b is in the column space of A.
■ WORKED EXAMPLES ■


5.1 A We are given three different vectors $b_1$, $b_2$, $b_3$. Construct a matrix so that the
equations $Av = b_1$ and $Av = b_2$ are solvable, but $Av = b_3$ is not solvable. How can you
decide if this is possible? How could you construct A?


Solution We want to have $b_1$ and $b_2$ in the column space of A. Then $Av = b_1$ and
$Av = b_2$ will be solvable. The quickest way is to make $b_1$ and $b_2$ the two columns of A.
Then the solutions are v = (1, 0) and v = (0, 1).

Also, we don’t want $Av = b_3$ to be solvable. So don’t make the column space any
larger! Keeping only the columns $b_1$ and $b_2$, the question is: Do we already have $b_3$?


Is $Av = \begin{bmatrix} b_1 & b_2 \end{bmatrix}\begin{bmatrix} v_1\\ v_2 \end{bmatrix} = b_3$ solvable? Is $b_3$ a combination of $b_1$ and $b_2$?


If the answer is no, we have the desired matrix A. If $b_3$ is a combination of $b_1$ and $b_2$,
then it is not possible to construct A. The column space C(A) will have to contain $b_3$.


5.1 B Describe a subspace S of each vector space V, and then a subspace SS of S.


$V_3$ = all combinations of (1, 1, 0, 0) and (1, 1, 1, 0) and (1, 1, 1, 1)
$V_2$ = all vectors v perpendicular to u = (1, 2, 1), so $u \cdot v = 0$
$V_4$ = all solutions y(x) to the equation $d^4y/dx^4 = 0$


Describe each V two ways: (1) All combinations of .... (2) All solutions of ....


Solution $V_3$ starts with three vectors. A subspace S comes from all combinations of the
first two vectors (1, 1, 0, 0) and (1, 1, 1, 0). A subspace SS of S comes from all multiples
(c, c, 0, 0) of the first vector. So many possibilities.

A subspace S of $V_2$ is the line through (1, −1, 1). This line is perpendicular to u.
The zero vector z = (0, 0, 0) is in S. The smallest subspace SS is Z.

$V_4$ contains all cubic polynomials $y = a + bx + cx^2 + dx^3$, with $d^4y/dx^4 = 0$. The
quadratic polynomials (without an $x^3$ term) give a subspace S. The linear polynomials
are one choice of SS. The constants y = a could be SSS.

In all three parts we could take S = V itself, and SS = the zero subspace Z.

Each V can be described as all combinations of .... and as all solutions of ....:


$V_3$ = all combinations of the 3 vectors                      $V_3$ = all solutions of $v_1 - v_2 = 0$.
$V_2$ = all combinations of (1, 0, −1) and (1, −1, 1)          $V_2$ = all solutions of $u \cdot v = 0$.
$V_4$ = all combinations of 1, x, $x^2$, $x^3$                 $V_4$ = all solutions to $d^4y/dx^4 = 0$.
Problem Set 5.1


Questions 1-10 are about the “subspace requirements”: v + w and cv (and then all 
linear combinations cv + dw) stay in the subspace. 


1 One requirement can be met while the other fails. Show this by finding 


(a) A set of vectors in $R^2$ for which v + w stays in the set but $\tfrac{1}{2}v$ may be outside.
(b) A set of vectors in $R^2$ (other than two quarter-planes) for which every cv stays
in the set but v + w may be outside.


2 Which of the following subsets of $R^3$ are actually subspaces?


(a) The plane of vectors $(b_1, b_2, b_3)$ with $b_1 = b_2$.

(b) The plane of vectors with $b_1 = 1$.

(c) The vectors with $b_1b_2b_3 = 0$.

(d) All linear combinations of v = (1, 4, 0) and w = (2, 2, 2).
(e) All vectors that satisfy $b_1 + b_2 + b_3 = 0$.

(f) All vectors with $b_1 \le b_2 \le b_3$.


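3 Describe the smallest subspace of the matrix space M that contains


(a) $\begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix}$ and $\begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix}$        (b) $\begin{bmatrix} 1 & 1\\ 0 & 0 \end{bmatrix}$        (c) $\begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix}$ and $\begin{bmatrix} 1 & 0\\ 0 & 1 \end{bmatrix}$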
4 Let P be the plane in $R^3$ with equation x + y − 2z = 4. The origin (0, 0, 0) is not in
P! Find two vectors in P and check that their sum is not in P.


5 Let $P_0$ be the plane through (0, 0, 0) parallel to the previous plane P. What is the
equation for $P_0$? Find two vectors in $P_0$ and check that their sum is in $P_0$.


6 The subspaces of R® are planes, lines, R° itself, or Z containing only (0, 0,0). 


(a) Describe the three types of subspaces of R?. 
(b) Describe all subspaces of D, the space of 2 by 2 diagonal matrices. 


7 (a) The intersection of two planes through (0,0,0) is probably a ____ but it could 
be a ____. It can't be Z! 


(b) The intersection of a plane through (0,0,0) with a line through (0,0,0) is 
probably a ____ but it could be a ____. 


(c) If S and T are subspaces of R^5, prove that their intersection S ∩ T is a 
subspace of R^5. Here S ∩ T consists of the vectors that lie in both subspaces. 
Check the requirements on v + w and cv. 


8 Suppose P is a plane through (0,0,0) and L is a line through (0,0,0). The smallest 
vector space P + L containing both P and L is either ____ or ____. 
9 (a) Show that the set of invertible matrices in M is not a subspace. 

(b) Show that the set of singular matrices in M is not a subspace. 


10 True or false (check addition in each case by an example): 

(a) The symmetric matrices in M (with A^T = A) form a subspace. 
(b) The skew-symmetric matrices in M (with A^T = -A) form a subspace. 

(c) The unsymmetric matrices in M (with A^T ≠ A) form a subspace. 


Questions 11-19 are about column spaces C(A) and the equation Av = b. 


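11 Describe the column spaces (lines or planes) of these particular matrices: 

$$A = \begin{bmatrix} 1 & 2 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \qquad B = \begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 0 & 0 \end{bmatrix} \qquad C = \begin{bmatrix} 1 & 0 \\ 2 & 0 \\ 0 & 0 \end{bmatrix}$$


12 For which right sides (find a condition on b1, b2, b3) are these systems solvable? 

$$\text{(a)} \begin{bmatrix} 1 & 4 & 2 \\ 2 & 8 & 4 \\ -1 & -4 & -2 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \qquad \text{(b)} \begin{bmatrix} 1 & 4 \\ 2 & 9 \\ -1 & -4 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$$


13 Adding row 1 of A to row 2 produces B. Adding column 1 to column 2 produces C. 
Which matrices have the same column space? Which have the same row space? 

$$A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix} \quad \text{and} \quad C = \begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix}$$


14 For which vectors (b1, b2, b3) do these systems have a solution? 

$$\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$$

$$\text{and} \quad \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$$


15 (Recommended) If we add an extra column b to a matrix A, then the column space gets 
larger unless ____. Give an example where the column space gets larger 
and an example where it doesn't. Why is Av = b solvable exactly when the 
column space doesn't get larger? Then it is the same for A and [A b]. 


16 The columns of AB are combinations of the columns of A. This means: The 
column space of AB is contained in (possibly equal to) the column space of A. 
Give an example where the column spaces of A and AB are not equal. 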
17 Suppose Av = b and Aw = b* are both solvable. Then Az = b + b* is solvable. 
What is z? This translates into: If b and b* are in the column space C(A), then 
b + b* is also in C(A). 


18 If A is any 5 by 5 invertible matrix, then its column space is ____. Why? 


19 True or false (with a counterexample if false): 

(a) The vectors b that are not in the column space C(A) form a subspace. 
(b) If C(A) contains only the zero vector, then A is the zero matrix. 
(c) The column space of 2A equals the column space of A. 
(d) The column space of A - I equals the column space of A (test this). 


20 Construct a 3 by 3 matrix whose column space contains (1, 1, 0) and (1, 0, 1) but not 
(1, 1, 1). Construct a 3 by 3 matrix whose column space is only a line. 


21 If the 9 by 12 system Av = b is solvable for every b, then C(A) must be ____. 


Challenge Problems 


22 Suppose S and T are two subspaces of a vector space V. The sum S + T contains all 
sums s + t of a vector s in S and a vector t in T. Then S + T is a vector space. 

If S and T are lines in R^m, what is the difference between S + T and S ∪ T? 
That union contains all vectors from S and all vectors from T. Explain this 
statement: The span of S ∪ T is S + T. 


23 If S is the column space of A and T is C(B), then S + T is the column space of 
what matrix M? The columns of A and B and M are all in R^m. (I don't think 
A + B is always a correct M.) 


24 Show that the matrices A and [A AB] (this has extra columns) have the same 
column space. But find a square matrix with C(A^2) smaller than C(A). 


25 An n by n matrix has C(A) = R^n exactly when A is an ____ matrix. 
5.2 The Nullspace of A: Solving Av = 0 


This section is about the subspace containing all solutions to Av = 0. The m by n matrix A 
can be square or rectangular. One immediate solution is v = 0. For invertible matrices this is 
the only solution. For other matrices, not invertible, there are nonzero solutions to Av = 0. 
Each solution v belongs to the nullspace N(A). 

Elimination will find all solutions and identify this very important subspace. 


The nullspace of A consists of all solutions to Av = 0. These vectors v are in R^n. 


Check that the solution vectors form a subspace. Suppose v and w are in the nullspace, 
so that Av = 0 and Aw = 0. The rules of matrix multiplication give A(v + w) = 0 + 0. 
The rules also give A(cv) = c0. The right sides are still zero. Therefore v + w and cv are 
also in the nullspace N(A). Since we can add and multiply without leaving the nullspace, it 
is a subspace. 

The solution vectors v have n components. They are vectors in R^n, so the nullspace 
N(A) is a subspace of R^n. The column space C(A) is a subspace of R^m. 

If the right side b is not zero, the solutions of Av = b do not form a subspace. The 
vector v = 0 is only a solution if b = 0. When the set of solutions does not include v = 0, 
it cannot be a subspace. Section 5.3 will show how the solutions to Av = b (if there are any 
solutions) are shifted away from the origin by one particular solution vp. 


Example 1 x + 2y + 3z = 0 comes from the 1 by 3 matrix A = [1 2 3]. 
This equation Av = 0 produces a plane through the origin (0,0,0). The plane is a 
subspace of R^3, and it is the nullspace of A. 

The solutions to x + 2y + 3z = 6 also form a plane, but not a subspace. 


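Example 2 Describe the nullspace of $A = \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix}$. This matrix is singular! 
Solution Apply elimination to the linear equations Av = 0: 

$$\begin{aligned} v_1 + 2v_2 &= 0 \\ 3v_1 + 6v_2 &= 0 \end{aligned} \qquad \longrightarrow \qquad \begin{aligned} v_1 + 2v_2 &= 0 \\ 0 &= 0 \end{aligned}$$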


There is really only one equation. The second equation is the first equation multiplied 
by 3. In the row picture, the line v1 + 2v2 = 0 is the same as the line 3v1 + 6v2 = 0. 
That line is the nullspace N(A). It contains all solutions v = (v1, v2). 


To describe this line of solutions, here is an efficient way. Choose one point on the line 
(one “special solution”). Then all points on the line are multiples of this one. We choose the 
second component to be v2 = 1 (a special choice). From the equation v1 + 2v2 = 0, the first 
component must be v1 = -2. The special solution s is (-2, 1): 


$$\text{Special solution} \qquad \text{The nullspace of } A = \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix} \text{ contains all multiples of } s = \begin{bmatrix} -2 \\ 1 \end{bmatrix}.$$

This is the best way to describe the nullspace, by computing special solutions to Av = 0. 
The nullspace consists of all combinations of the special solutions. 


The plane x + 2y + 3z = 0 in Example 1 had two special solutions: 


$$\begin{bmatrix} 1 & 2 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = 0 \quad \text{has the special solutions} \quad s_1 = \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} \text{ and } s_2 = \begin{bmatrix} -3 \\ 0 \\ 1 \end{bmatrix}$$


Those vectors s1 and s2 lie on the plane x + 2y + 3z = 0, which is the nullspace of 
A = [1 2 3]. All vectors on the plane are combinations of s1 and s2. 

Notice what is special about s1 and s2. They have ones and zeros in the last two 
components. Those components are “free” and we choose them specially as 1 and 0. 
Then the first components -2 and -3 are determined by the equation Av = 0. 

The first column of A = [1 2 3] contains the pivot, so the first component v1 is not 
free. The free components correspond to columns without pivots. This description of special 
solutions will be completed after one more example. 

The special choice (one or zero) is only for the free variables in the special solutions. 


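Example 3 Describe the nullspaces N(A), N(B), N(C) of these three matrices: 


$$A = \begin{bmatrix} 1 & 2 \\ 3 & 8 \end{bmatrix} \qquad B = \begin{bmatrix} A \\ 2A \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 3 & 8 \\ 2 & 4 \\ 6 & 16 \end{bmatrix} \qquad C = \begin{bmatrix} A & 2A \end{bmatrix} = \begin{bmatrix} 1 & 2 & 2 & 4 \\ 3 & 8 & 6 & 16 \end{bmatrix}$$


Solution The equation Av = 0 has only the zero solution v = 0. The nullspace is Z. 
It contains only the single point v = 0 in R^2. This comes from elimination: 


$$\begin{bmatrix} 1 & 2 \\ 3 & 8 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \quad \text{yields} \quad \begin{bmatrix} 1 & 2 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} v_1 = 0 \\ v_2 = 0 \end{bmatrix}.$$

A is invertible. There are no special solutions. All columns of this A have pivots. 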

The rectangular matrix B has the same nullspace Z. The first two equations in Bv = 0 
again require v = 0. The last two equations would also force v = 0. When we add 
extra equations, the nullspace certainly cannot become larger. The extra rows impose more 
conditions on the vectors v in the nullspace. 

The rectangular matrix C is different. It has extra columns instead of extra rows. The 
solution vector v has four components. Elimination will produce pivots in the first two 
columns of C, but the last two columns are “free”. They don’t have pivots: 


$$C = \begin{bmatrix} 1 & 2 & 2 & 4 \\ 3 & 8 & 6 & 16 \end{bmatrix} \quad \text{becomes} \quad U = \begin{bmatrix} 1 & 2 & 2 & 4 \\ 0 & 2 & 0 & 4 \end{bmatrix}$$

Columns 1 and 2 are the 2 pivot columns. Columns 3 and 4 are the 2 free columns. 


For the free variables v3 and v4, we make special choices of ones and zeros. First v3 = 1, 
v4 = 0 and second v3 = 0, v4 = 1. Then the pivot variables v1 and v2 are determined. 
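Solve Uv = 0 to get two special solutions in the nullspace of C (and U). 


$$\text{Special solutions } s_1 \text{ and } s_2 \qquad s_1 = \begin{bmatrix} -2 \\ 0 \\ 1 \\ 0 \end{bmatrix} \text{ and } s_2 = \begin{bmatrix} 0 \\ -2 \\ 0 \\ 1 \end{bmatrix} \quad \begin{matrix} \leftarrow \text{pivot} \\ \leftarrow \text{variables} \\ \leftarrow \text{free} \\ \leftarrow \text{variables} \end{matrix}$$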


One more comment to anticipate what is coming soon. Elimination will not stop at the 
upper triangular U ! We can continue to make this matrix simpler, in two ways: 


1. Produce zeros above the pivots. Eliminate upward. 


2. Produce ones in the pivots. Divide the whole row by its pivot. 


Those steps don’t change the zero vector on the right side of the equation. The nullspace 
stays the same. This nullspace becomes easiest to see when we reach the reduced row 
echelon form R. It has I in the pivot columns, when row 2 is divided by 2: 


$$\text{Reduced form } R \qquad U = \begin{bmatrix} 1 & 2 & 2 & 4 \\ 0 & 2 & 0 & 4 \end{bmatrix} \quad \text{becomes} \quad R = \begin{bmatrix} 1 & 0 & 2 & 0 \\ 0 & 1 & 0 & 2 \end{bmatrix}$$

Now the pivot columns contain I. 


I subtracted row 2 of U from row 1, and then multiplied row 2 by 1/2. The original two 
equations have simplified to v1 + 2v3 = 0 and v2 + 2v4 = 0. 

The first special solution is still s1 = (-2, 0, 1, 0). All special solutions are unchanged. 
Special solutions are much easier to find from the reduced system Rv = 0. 

Before moving to m by n matrices A and their nullspaces N(A) and special solutions, 
allow me to repeat one comment. For many matrices, the only solution to Av = 0 is v = 0. 
Their nullspaces N(A) = Z contain only that zero vector. The only combination of 
the columns that produces b = 0 is then the “zero combination” or “trivial combination”. 
The solution is trivial (just v = 0) but the idea is not trivial. 

This case of a zero nullspace Z is of the greatest importance. It says that the columns 
of A are independent. No combination of columns gives the zero vector (except the zero 
combination). All columns have pivots, and no columns are free. You will see this idea of 
independence again... 


Solving Av = 0 by Elimination 


This is important. A is rectangular and we still use elimination. We solve m equations 
in n unknowns. After A is simplified to U or to R, we read off the solution (or solutions). 
Remember the two stages (forward and back) in solving Av = 0: 
1. Elimination takes A to a triangular U (or its reduced form R). 
2. Back substitution in Uv = 0 or Rv = 0 produces v. 


You will notice a difference in back substitution, when A and U have fewer than n pivots. 
We are allowing all matrices in this chapter, not just the nice ones (which are square matrices 
with inverses). 

Pivots are still nonzero. The columns below the pivots are still zero. But it might 
happen that a column has no pivot. That free column doesn’t stop the calculation. Go on 
to the next column. The first example is a 3 by 4 matrix with two pivots: 


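$$\text{Elimination on } A \qquad A = \begin{bmatrix} 1 & 1 & 2 & 3 \\ 2 & 2 & 8 & 10 \\ 3 & 3 & 10 & 13 \end{bmatrix}$$


Certainly a11 = 1 is the first pivot. Clear out the 2 and 3 below that pivot: 


$$A \rightarrow \begin{bmatrix} 1 & 1 & 2 & 3 \\ 0 & 0 & 4 & 4 \\ 0 & 0 & 4 & 4 \end{bmatrix} \quad \begin{matrix} \\ \text{(subtract } 2 \times \text{row 1)} \\ \text{(subtract } 3 \times \text{row 1)} \end{matrix}$$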


The second column has a zero in the pivot position. We look below the zero for a nonzero 
entry, ready to do a row exchange. The entry below that position is also zero. Elimination 
can do nothing with the second column. This signals trouble, which we expect anyway for a 
rectangular matrix. There is no reason to quit, and we go on to the third column. 

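The second pivot is 4 (but it is in the third column). Subtracting row 2 from row 3 
clears out that third column below the pivot. The pivot columns are 1 and 3: 


$$\text{Triangular } U \qquad U = \begin{bmatrix} 1 & 1 & 2 & 3 \\ 0 & 0 & 4 & 4 \\ 0 & 0 & 0 & 0 \end{bmatrix} \qquad \begin{matrix} \text{Only two pivots} \\ \text{The last equation} \\ \text{became } 0 = 0 \end{matrix}$$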


The fourth column also has a zero in the pivot position—but nothing can be done. There is no 
row below it to exchange, and forward elimination is complete. The matrix has three rows, 
four columns, and only two pivots. The third equation in Av = 0 is the sum of the first two. 
It is automatically satisfied (0 = 0) when the first two equations are satisfied. Elimination 
reveals the inner truth about Av = 0. Soon we push on from U to R. 
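
The forward elimination just described can be sketched in a few lines of Python. This is only an illustration, not code from the text: the function name `echelon` is an assumption made here, columns are counted from 0 (so the pivot columns 1 and 3 of the text appear as 0 and 2), and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def echelon(A):
    """Forward elimination: return an echelon form U and the pivot columns."""
    U = [[Fraction(x) for x in row] for row in A]
    m, n = len(U), len(U[0])
    pivot_cols = []
    r = 0  # row where the next pivot should go
    for c in range(n):
        if r == m:
            break
        # look at and below row r for a nonzero entry in column c
        p = next((i for i in range(r, m) if U[i][c] != 0), None)
        if p is None:
            continue  # no pivot here: column c is free, go on to the next column
        U[r], U[p] = U[p], U[r]  # row exchange (if needed)
        for i in range(r + 1, m):
            factor = U[i][c] / U[r][c]
            U[i] = [a - factor * b for a, b in zip(U[i], U[r])]
        pivot_cols.append(c)
        r += 1
    return U, pivot_cols

A = [[1, 1, 2, 3], [2, 2, 8, 10], [3, 3, 10, 13]]
U, pivot_cols = echelon(A)
print(pivot_cols)  # [0, 2]
```

The second column is skipped because every entry at and below the pivot position is zero, which is exactly the situation met in the example.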

Now comes back substitution, to find all solutions to Uv = 0. With four unknowns and 
only two pivots, there are many solutions. The question is how to write them all down. A 
good method is to separate the pivot variables from the free variables. 


P The pivot variables are v1 and v3. Columns 1 and 3 contain pivots. 


F The free variables are v2 and v4. Columns 2 and 4 have no pivots. 

The free variables v2 and v4 can be given any values whatsoever. Then back substitution finds 
the pivot variables v1 and v3. (In Chapter 4 no variables were free. When A is invertible, all 
variables are pivot variables.) The simplest choices for the free variables are ones and zeros. 
Those choices give the special solutions. 


Special solutions to v1 + v2 + 2v3 + 3v4 = 0 and 4v3 + 4v4 = 0 


• Set v2 = 1 and v4 = 0. By back substitution v3 = 0. Then v1 = -1. 


• Set v2 = 0 and v4 = 1. By back substitution v3 = -1. Then v1 = -1. 


These special solutions solve Uv = 0 and therefore Av = 0. They are in the nullspace. The 
good thing is that every solution is a combination of the special solutions. 


$$\text{Complete solution to } Av = 0 \qquad v = v_2 \underbrace{\begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}}_{\text{special}} + v_4 \underbrace{\begin{bmatrix} -1 \\ 0 \\ -1 \\ 1 \end{bmatrix}}_{\text{special}} = \underbrace{\begin{bmatrix} -v_2 - v_4 \\ v_2 \\ -v_4 \\ v_4 \end{bmatrix}}_{\text{complete}}. \tag{1}$$


Please look again at that answer. It is the main goal of this section. The vector s1 = 
(-1, 1, 0, 0) is the special solution when v2 = 1 and v4 = 0. The second special solution 
has v2 = 0 and v4 = 1. All solutions are linear combinations of s1 and s2. The 
special solutions are in the nullspace N(A), and their combinations fill the whole nullspace. 
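
As a rough sketch of the back substitution above, the following Python code sets one free variable to 1, the others to 0, and solves upward for the pivot variables. The function name `special_solutions` and the 0-based column numbering are assumptions made for this illustration. It also assumes U is already in echelon form, with the pivot of row i sitting in column `pivot_cols[i]`.

```python
from fractions import Fraction

def special_solutions(U, pivot_cols):
    """One special solution of Uv = 0 for each free column of the echelon matrix U."""
    n = len(U[0])
    free_cols = [c for c in range(n) if c not in pivot_cols]
    solutions = []
    for f in free_cols:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)  # this free variable is 1, the other free variables are 0
        # back substitution: last pivot row first
        for row in reversed(range(len(pivot_cols))):
            p = pivot_cols[row]
            rest = sum((U[row][j] * v[j] for j in range(p + 1, n)), Fraction(0))
            v[p] = -rest / U[row][p]
        solutions.append(v)
    return solutions

U = [[1, 1, 2, 3], [0, 0, 4, 4], [0, 0, 0, 0]]
for s in special_solutions(U, [0, 2]):
    print([str(x) for x in s])
# ['-1', '1', '0', '0']
# ['-1', '0', '-1', '1']
```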

There is a special solution for each free variable. If no variables are free (this means all 
n columns have pivots), then the only solution to Uv = 0 and Av = 0 is the trivial solution 
v = 0. With no free variables, the nullspace is Z. 


Example 4 Find the nullspace of $U = \begin{bmatrix} 1 & 5 & 7 \\ 0 & 0 & 9 \end{bmatrix}$. 

The second column of U has no pivot. So v2 is free. The special solution has v2 = 1. Back 
substitution into 9v3 = 0 gives v3 = 0. Then v1 + 5v2 = 0 or v1 = -5. The solutions to 
Uv = 0 are multiples of one special solution s1: 


$$s_1 = \begin{bmatrix} -5 \\ 1 \\ 0 \end{bmatrix} \qquad \begin{matrix} \text{The nullspace of } U \text{ is a line in } \mathbf{R}^3. \\ \text{It contains multiples of the special solution } s_1 = (-5, 1, 0). \\ \text{One variable is free.} \end{matrix}$$


The matrix R has zeros above and below the pivots, and ones in the pivots. 
By continuing elimination on U, the 7 is removed and the pivot changes from 9 to 1. The 
final result will be the reduced row echelon form R: 


$$U = \begin{bmatrix} 1 & 5 & 7 \\ 0 & 0 & 9 \end{bmatrix} \quad \text{reduces to} \quad R = \begin{bmatrix} 1 & 5 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \text{rref}(U).$$
Echelon Matrices 


Forward elimination goes from A to U. It acts by row operations, including row exchanges. 
It goes on to the next column when no pivot is available in the current column. The m by n 
“staircase” U is an echelon matrix. 

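Here is a 4 by 7 echelon matrix with the three pivots p highlighted in boldface: 


$$U = \begin{bmatrix} \mathbf{p} & \mathbf{x} & x & x & x & \mathbf{x} & x \\ 0 & \mathbf{p} & x & x & x & \mathbf{x} & x \\ 0 & 0 & 0 & 0 & 0 & \mathbf{p} & x \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \qquad \begin{matrix} \text{Three pivot variables } v_1, v_2, v_6 \\ \text{Four free variables } v_3, v_4, v_5, v_7 \\ \text{Four special solutions in } N(U) \\ R \text{ will have } p = 1 \text{ and bold } x = 0 \end{matrix}$$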


Question What are the column space and the nullspace for this matrix ? 


Answer The columns have four components so they lie in R^4. (Not in R^3!) The fourth 
component of every column is zero. The column space C(U) consists of all vectors of the 
form (b1, b2, b3, 0). For those vectors we can solve Uv = b by back substitution. These 
vectors b are all possible combinations of the seven columns. 


The nullspace N(U) is a subspace of R^7. The solutions to Uv = 0 are all the combinations 
of the four special solutions, one for each free variable: 


1. Columns 3, 4, 5, 7 have no pivots. The free variables are v3, v4, v5, v7. 
2. Set one free variable to 1 and set the other free variables to zero. 


3. Solve Uv = 0 for the pivot variables v1, v2, v6 to get a special solution. 


The nonzero rows of an echelon matrix go down in a staircase pattern. The pivots are the 
first nonzero entries in those rows. There is a column of zeros below every pivot. 


The Counting Theorem 


Counting the pivots leads to an extremely important theorem. Suppose A has more columns 
than rows. With n > m there is at least one free variable. The system Av = 0 has at least 
one special solution. This solution is not zero ! 


Suppose Av = 0 has more unknowns than equations (n > m, more columns than rows). 
Then there are nonzero solutions in N(A). There must be free columns, without pivots. 


A short wide matrix (n > m) always has nonzero vectors in its nullspace. There must be at 
least n - m free variables, since the number of pivots cannot exceed m. (The matrix only 
has m rows, and a row never has two pivots.) Of course a row might have no pivot, which 
means an extra free variable. But here is the point: When there is a free variable, it can be 
set to 1. Then the equation Av = 0 has a nonzero solution. 

To repeat: There are at most m pivots. With n > m, the system Av = 0 has a nonzero 
solution. Actually there are infinitely many solutions, since any multiple cv is also a solution. 
The nullspace contains at least a line of solutions. With two free variables, there will be two 
special solutions and the nullspace will be even larger. 


The nullspace is a subspace. Its “dimension” is the number of special solutions. 
This central idea—the dimension of a subspace—is defined and explained in this chapter. 


Dimension of C(A) = rank of matrix = number of pivot columns 
Dimension of N(A) = nullity of matrix = number of free columns. 
Counting Theorem with n columns Rank r plus nullity n — r equals n. 
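
As a quick check of the Counting Theorem, here are the counts for three matrices that appear in this section. The numbers are read off from the examples above, and the table layout is only one convenient way to display them:

```latex
\begin{array}{lccc}
\text{matrix} & n & \text{rank } r & \text{nullity } n - r \\ \hline
3 \text{ by } 4 \text{ matrix } A \text{ (pivot columns 1 and 3)} & 4 & 2 & 2 \\
2 \text{ by } 3 \text{ matrix } U \text{ of Example 4} & 3 & 2 & 1 \\
4 \text{ by } 7 \text{ echelon matrix } U & 7 & 3 & 4
\end{array}
```

In every row, r + (n - r) = n.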


The Reduced Row Echelon Matrix R 


From an echelon matrix U we go one more step. Continue with a 3 by 4 example: 


$$U = \begin{bmatrix} 1 & 1 & 2 & 3 \\ 0 & 0 & 4 & 4 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$


We can divide the second row by 4. Then both pivots equal 1. We can subtract 2 times this 
new row [0 0 1 1] from the row above. The reduced row echelon matrix R has zeros 
above the pivots as well as below: 


$$\text{Reduced row echelon matrix} \qquad R = \begin{bmatrix} 1 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \qquad \text{Pivot rows contain } I$$


R has 1’s as pivots. Zeros above pivots come from upward elimination. 


Important If A is invertible, its reduced row echelon form is the identity matrix R = I. 
This is the ultimate in row reduction. Of course the nullspace is then Z. 

The zeros in R make it easy to find the special solutions (the same as before): 


1. Set v2 = 1 and v4 = 0. Solve Rv = 0. Then v1 = -1 and v3 = 0. 
Those numbers -1 and 0 are sitting in column 2 of R (with plus signs). 

2. Set v2 = 0 and v4 = 1. Solve Rv = 0. Then v1 = -1 and v3 = -1. 
Those numbers -1 and -1 are sitting in column 4 (with plus signs). 


By reversing signs we can read off the special solutions directly from R. The nullspace 
N(A) = N(U) = N(R) contains all combinations of the special solutions: 


$$v = v_2 \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix} + v_4 \begin{bmatrix} -1 \\ 0 \\ -1 \\ 1 \end{bmatrix} \quad (\text{complete solution of } Av = 0).$$


The next section of the book moves firmly from U to the row reduced form R. The 
MATLAB command [R, pivcol] = rref(A) produces R and a list of the pivot columns. 
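
For readers without MATLAB, here is a minimal Python sketch of the same reduction. It is not MATLAB's implementation: the function below is an illustrative stand-in written for this example, it uses exact fractions instead of floating point, and it numbers columns from 0 where MATLAB numbers them from 1.

```python
from fractions import Fraction

def rref(A):
    """Return the reduced row echelon form R and the list of pivot columns."""
    R = [[Fraction(x) for x in row] for row in A]
    m, n = len(R), len(R[0])
    pivot_cols = []
    r = 0
    for c in range(n):
        if r == m:
            break
        p = next((i for i in range(r, m) if R[i][c] != 0), None)
        if p is None:
            continue  # free column
        R[r], R[p] = R[p], R[r]
        pivot = R[r][c]
        R[r] = [x / pivot for x in R[r]]  # produce a one in the pivot
        for i in range(m):  # produce zeros above and below the pivot
            if i != r and R[i][c] != 0:
                factor = R[i][c]
                R[i] = [a - factor * b for a, b in zip(R[i], R[r])]
        pivot_cols.append(c)
        r += 1
    return R, pivot_cols

R, pivcol = rref([[1, 1, 2, 3], [2, 2, 8, 10], [3, 3, 10, 13]])
print(pivcol)  # [0, 2]
```

For the 3 by 4 example of this section, the rows of R come out as [1 1 0 1], [0 0 1 1] and [0 0 0 0].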
REVIEW OF THE KEY IDEAS 


1. The nullspace N(A) is a subspace of R^n. It contains all solutions to Av = 0. 


2. Elimination produces an echelon matrix U, and then a row reduced R (pivots = 1). 


3. Every free column of U or R leads to a special solution. The free variable equals 1 
and the other free variables equal 0. Back substitution solves Av = 0. 


4. The complete solution to Av = 0 is a combination of the special solutions. 
5. A has at least one free column and one special solution if n > m: N(A) is not Z. 


6. The count of pivot columns and free columns is r + (n —r) =n. 


WORKED EXAMPLES 


3.2 A Create a 3 by 4 matrix R whose special solutions to Rv = 0 are s1 and s2: 


$$s_1 = \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix} \text{ and } s_2 = \begin{bmatrix} -2 \\ 0 \\ -6 \\ 1 \end{bmatrix} \qquad \begin{matrix} \text{pivot columns 1 and 3} \\ \text{free variables } v_2 \text{ and } v_4 \end{matrix}$$


Describe all matrices A with this nullspace N(A) = combinations of s1 and s2. 


Solution The reduced matrix R has pivots = 1 in columns 1 and 3. There is no third 
pivot, so the third row of R is all zeros. The free columns 2 and 4 will be combinations of 
the pivot columns: 


$$R = \begin{bmatrix} 1 & 3 & 0 & 2 \\ 0 & 0 & 1 & 6 \\ 0 & 0 & 0 & 0 \end{bmatrix} \quad \text{has} \quad Rs_1 = 0 \text{ and } Rs_2 = 0.$$


The entries 3, 2, 6 in R are the negatives of -3, -2, -6 in the special solutions! 

R is only one matrix (one possible A) with the required nullspace. We could do any 
elementary operations on R: exchange rows, multiply a row by any c ≠ 0, subtract any 
multiple of one row from another. R can be multiplied (on the left) by any invertible 
matrix, without changing its nullspace. 
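
The claim about left multiplication is easy to test numerically. In the sketch below, the invertible matrix E is an arbitrary choice made for illustration (lower triangular with ones on its diagonal), and the helper names `matmul` and `matvec` are not from the text.

```python
def matmul(X, Y):
    """Matrix product XY for matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matvec(A, v):
    """Matrix-vector product Av."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

R = [[1, 3, 0, 2], [0, 0, 1, 6], [0, 0, 0, 0]]
s1 = [-3, 1, 0, 0]
s2 = [-2, 0, -6, 1]

# Hypothetical invertible E: ones on the diagonal, so its determinant is 1.
E = [[1, 0, 0], [2, 1, 0], [0, 3, 1]]
A = matmul(E, R)

print(A)                            # [[1, 3, 0, 2], [2, 6, 1, 10], [0, 0, 3, 18]]
print(matvec(A, s1), matvec(A, s2))  # [0, 0, 0] [0, 0, 0]
```

The matrix A = ER has no zero row and looks quite different from R, but the same two special solutions still satisfy Av = 0.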


Every 3 by 4 matrix has at least one special solution. These matrices have two. 
3.2 B Find the special solutions and the complete solutions to Av = 0 and A2 v = 0: 

$$A = \begin{bmatrix} 3 & 6 \\ 1 & 2 \end{bmatrix} \qquad A_2 = \begin{bmatrix} A & A \end{bmatrix} = \begin{bmatrix} 3 & 6 & 3 & 6 \\ 1 & 2 & 1 & 2 \end{bmatrix}$$


Which are the pivot columns? Which are the free variables? What is R in each case? 


Solution Av = 0 has one special solution s = (-2, 1). The line of all cs is the 
complete solution. The first column of A is its pivot column, and v2 is the free variable: 


$$A = \begin{bmatrix} 3 & 6 \\ 1 & 2 \end{bmatrix} \rightarrow R = \begin{bmatrix} 1 & 2 \\ 0 & 0 \end{bmatrix} \qquad \begin{bmatrix} A & A \end{bmatrix} \rightarrow R_2 = \begin{bmatrix} 1 & 2 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

Notice that R2 has only one pivot column (the first column). All the variables v2, v3, v4 
are free. There are three special solutions to A2 v = 0 (and also R2 v = 0): 

s1 = (-2, 1, 0, 0)   s2 = (-1, 0, 1, 0)   s3 = (-2, 0, 0, 1)   Complete v = c1 s1 + c2 s2 + c3 s3. 


With r pivots, A has n — r free variables and Av = 0 has n — r special solutions. 


Problem Set 5.2 


Questions 1-4 and 5-8 are about the matrices in Problems 1 and 5. 


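1 Reduce these matrices to their ordinary echelon forms U: 

$$\text{(a)} \; A = \begin{bmatrix} 1 & 2 & 2 & 4 & 6 \\ 1 & 2 & 3 & 6 & 9 \\ 0 & 0 & 1 & 2 & 3 \end{bmatrix} \qquad \text{(b)} \; B = \begin{bmatrix} 2 & 4 & 2 \\ 0 & 4 & 4 \\ 0 & 8 & 8 \end{bmatrix}$$
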
Which are the free variables and which are the pivot variables ? 


2 For the matrices in Problem 1, find a special solution for each free variable. (Set the 
free variable to 1. Set the other free variables to zero.) 


3 By combining the special solutions in Problem 2, describe every solution to Av = 0 
and Bv = 0. The nullspace contains only v = 0 when there are no ____. 


4 By further row operations on each U in Problem 1, find the reduced echelon form R. 
True or false: The nullspace of R equals the nullspace of U. 


5 By row operations reduce this new A and B to triangular echelon form U. Write down 
a 2 by 2 lower triangular L such that B = LU. 


$$A = \begin{bmatrix} -1 & 3 & 5 \\ -2 & 6 & 10 \end{bmatrix} \qquad B = \begin{bmatrix} -1 & 3 & 5 \\ -2 & 6 & 7 \end{bmatrix}$$


6 For the same A and B, find the special solutions to Av = 0 and Bv = 0. For an m by 
n matrix, the number of pivot variables plus the number of free variables is ____. 


7 In Problem 5, describe the nullspaces of A and B in two ways. Give the equations for 
the plane or the line, and give all vectors v that satisfy those equations as combinations 
of the special solutions. 


8 Reduce the echelon forms U in Problem 5 to R. For each R draw a box around the 
identity matrix that is in the pivot rows and pivot columns. 


Questions 9-17 are about free variables and pivot variables. 
9 True or false (with reason if true or example to show it is false) : 


(a) A square matrix has no free variables. 
(b) An invertible matrix has no free variables. 
(c) An m by n matrix has no more than n pivot variables. 


(d) An m by n matrix has no more than m pivot variables. 
10 Construct 3 by 3 matrices A to satisfy these requirements (if possible) : 


(a) A has no zero entries but U = I. 
(b) A has no zero entries but R = I. 
(c) A has no zero entries but R = U. 
(d) A=U =2R. 
11 Put as many 1’s as possible in a 4 by 7 echelon matrix U whose pivot columns are 
(a) 2, 4, 5 
(b) 1, 3, 6, 7 
(c) 4 and 6. 


12 Put as many 1’s as possible in a 4 by 8 reduced echelon matrix R so that the free 
columns are 


(a) 2,4, 5,6 
(b) 1,3, 6,7, 8. 


13 Suppose column 4 of a 3 by 5 matrix is all zero. Then v4 is certainly a ____ variable. 
The special solution for this variable is the vector s = ____. 


14 Suppose the first and last columns of a 3 by 5 matrix are the same (not zero). Then 
____ is a free variable. Find the special solution for this variable. 


15 Suppose an m by n matrix has r pivots. The number of special solutions is ____. 
The nullspace contains only v = 0 when r = ____. The column space is all of R^m 
when r = ____. 


16 The nullspace of a 5 by 5 matrix contains only v = 0 when the matrix has ____ 
pivots. The column space is R^5 when there are ____ pivots. Explain why. 


17 The equation x - 3y - z = 0 determines a plane in R^3. What is the matrix A in 
this equation? Which are the free variables? The special solutions are (3, 1, 0) and ____. 


18 (Recommended) The plane x - 3y - z = 12 is parallel to the plane x - 3y - z = 0 
in Problem 17. One particular point on this plane is (12, 0, 0). All points on the plane 
have the form (fill in the first components) 


$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} \underline{\quad} \\ 0 \\ 0 \end{bmatrix} + y \begin{bmatrix} \underline{\quad} \\ 1 \\ 0 \end{bmatrix} + z \begin{bmatrix} \underline{\quad} \\ 0 \\ 1 \end{bmatrix}.$$


19 Prove that U and A = LU have the same nullspace when L is invertible: 


If Uv = 0 then LUv = 0. If LUv = 0, how do you know Uv = 0? 


20 Suppose column 1 + column 3 + column 5 = 0 in a 4 by 5 matrix with four pivots. 
Which column is sure to have no pivot (and which variable is free)? What is the 
special solution? What is the nullspace? 


Questions 21-28 ask for matrices (if possible) with specific properties. 


21 Construct a matrix whose nullspace consists of all combinations of (2, 2, 1, 0) and 
(3, 1, 0, 1). 


22 Construct a matrix whose nullspace consists of all multiples of (4, 3, 2,1). 


23 Construct a matrix whose column space contains (1, 1, 5) and (0, 3, 1) and whose nullspace 
contains (1,1, 2). 


24 Construct a matrix whose column space contains (1, 1,0) and (0, 1, 1) and whose nullspace 
contains (1,0, 1) and (0, 0, 1). 


25 Construct a matrix whose column space contains (1, 1,1) and whose nullspace is the 
line of multiples of (1,1, 1,1). 


26 Construct a 2 by 2 matrix whose nullspace equals its column space. This is possible. 

27 Why does no 3 by 3 matrix have a nullspace that equals its column space? 


28 (Important) If AB = 0 then the column space of B is contained in the ____ of A. 
Give an example of A and B. 


29 The reduced form R of a 3 by 3 matrix with randomly chosen entries is almost sure to 
be ____. What reduced form R is virtually certain if the random A is 4 by 3? 
30 Show by example that these three statements are generally false: 

(a) A and A^T have the same nullspace. 
(b) A and A^T have the same free variables. 
(c) If R is the reduced form of A then R^T is the reduced form of A^T. 


31 If the nullspace of A consists of all multiples of v = (2, 1, 0, 1), how many pivots 
appear in U? What is R? 


32 If the special solutions to Rv = 0 are in the columns of these N, go backward to find 
the nonzero rows of the reduced matrices R: 

$$N = \begin{bmatrix} 2 & 3 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad N = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \quad \text{and} \quad N = \begin{bmatrix} \; \\ \; \\ \; \end{bmatrix} \text{ (empty 3 by 1)}.$$


33 (a) What are the five 2 by 2 reduced echelon matrices R whose entries are all 0’s and 
1’s? 

(b) What are the eight 1 by 3 matrices containing only 0’s and 1’s? Are all eight of 
them reduced echelon matrices R? 


34 Explain why A and -A always have the same reduced echelon form R. 


Challenge Problems 


35 If A is 4 by 4 and invertible, describe all vectors in the nullspace of the 4 by 8 matrix 
B = [A A]. 


36 How is the nullspace N(C) related to the spaces N(A) and N(B), if $C = \begin{bmatrix} A \\ B \end{bmatrix}$? 


37 Kirchhoff’s Law says that current in = current out at every node. This network has 
six currents y1, ..., y6 (the arrows show the positive direction, each y_i could be 
positive or negative). Find the four equations Ay = 0 for Kirchhoff’s Law at the 
four nodes. Reduce to Uy = 0. Find three special solutions in the nullspace of A. 

[Figure: network with four nodes and six currents y1, ..., y6] 

5.3 The Complete Solution to Av = b 


To solve Av = b by elimination, include b as a new column next to the n columns of A. This 
“augmented matrix” is [A b]. When the steps of elimination operate on A (the left side 
of the equations), they also operate on the right side b. So we always keep correct equations, 
and they become simple to solve. 

There are still r pivot columns and n - r free columns in A. Each free column still 
gives a special solution to Av = 0. The new question is to find a particular solution vp 
with Avp = b. That solution will exist unless elimination leads to an impossible equation 
(a zero row on the left side, a nonzero number on the right side). Then back substitution 
finds vp. Every solution to Av = b has the form vp + vn. 

In the process of elimination, we discover the rank of A. This is the number of pivots. 
The rank is also the number of nonzero rows after elimination. We start with m equations 
Av = 0, but the true number of equations is the rank r. We don’t want to count repeated 
rows, or rows that are combinations of previous rows, or zero rows. You will soon see that 
r counts the number of independent rows. And the great fact, still to prove and explain, 
is that the rank r also counts the number of independent columns : 


number of pivots = number of independent rows = number of independent columns. 


This is part of the Fundamental Theorem of Linear Algebra in Section 5.5. 
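An example of Av = b will make the possibilities clear. 


$$\begin{bmatrix} 1 & 3 & 0 & 2 \\ 0 & 0 & 1 & 4 \\ 1 & 3 & 1 & 6 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{bmatrix} = \begin{bmatrix} 1 \\ 6 \\ 7 \end{bmatrix} \quad \text{has the augmented matrix} \quad \begin{bmatrix} 1 & 3 & 0 & 2 & 1 \\ 0 & 0 & 1 & 4 & 6 \\ 1 & 3 & 1 & 6 & 7 \end{bmatrix} = \begin{bmatrix} A & b \end{bmatrix}.$$


The augmented matrix is just [A b]. When we apply the usual elimination steps to A 
and b, all the equations stay correct. Those steps produce R and d. 

In this example we subtract row 1 from row 3 and then subtract row 2 from row 3. 
This produces a row of zeros in R, and it changes b to a new right side d = (1, 6, 0): 


$$\begin{bmatrix} 1 & 3 & 0 & 2 \\ 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{bmatrix} = \begin{bmatrix} 1 \\ 6 \\ 0 \end{bmatrix} \quad \text{has the augmented matrix} \quad \begin{bmatrix} 1 & 3 & 0 & 2 & 1 \\ 0 & 0 & 1 & 4 & 6 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} R & d \end{bmatrix}.$$
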
That very last zero is crucial. The third equation has become 0 = 0, and we are safe. 


The equations can be solved. In the original matrix A, the first row plus the second row 
equals the third row. If the equations are consistent, this must be true on the right side 
of the equations also ! The all-important property on the right side was 1 + 6 = 7. 

Here are the same augmented matrices for any vector b = (by, be, b3): 


ES SC at NIUE are 
[A b]=]0 0 1 4 bj —]0 01 4 b = [Rod] 
I <3> 6S oe 00 0 0 b-—b —be 
Now we get 0 = 0 in the third equation provided b_3 - b_1 - b_2 = 0. This is b_1 + b_2 = b_3.
The example satisfied this requirement with 1 + 6 = 7. You see how elimination on [A b]
brings out the test on b for Av = b to be solvable.
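The elimination test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `reduce_augmented` and `is_solvable` are hypothetical (they are not from the text or from MATLAB), and exact fractions are used so that "zero" really means zero.

```python
from fractions import Fraction

def reduce_augmented(A, b):
    """Row reduce [A b]. Pivots are searched only in the columns of A.
    Returns (R, d, pivot_cols) with 0-based pivot column indices."""
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    m, n = len(A), len(A[0])
    pivots, r = [], 0
    for c in range(n):
        if r == m:
            break
        # look for a nonzero entry in column c, at or below row r
        p = next((i for i in range(r, m) if M[i][c] != 0), None)
        if p is None:
            continue                      # free column: no pivot here
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [x / piv for x in M[r]]    # make the pivot equal to 1
        for i in range(m):                # clear the rest of the pivot column
            if i != r:
                f = M[i][c]
                M[i] = [a - f * t for a, t in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    R = [row[:n] for row in M]
    d = [row[n] for row in M]
    return R, d, pivots

def is_solvable(A, b):
    """Av = b is solvable when d is zero in every zero row of R."""
    R, d, pivots = reduce_augmented(A, b)
    return all(d[i] == 0 for i in range(len(pivots), len(A)))

A = [[1, 3, 0, 2], [0, 0, 1, 4], [1, 3, 1, 6]]
R, d, pivots = reduce_augmented(A, [1, 6, 7])
print(d, pivots)                      # right side d and the pivot columns
print(is_solvable(A, [1, 6, 7]))      # 1 + 6 = 7, so the test passes
print(is_solvable(A, [1, 6, 8]))      # 1 + 6 is not 8, so no solution
```

The third row of R is zero, so solvability depends only on the third entry of d, which is b_3 - b_1 - b_2.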


One Particular Solution 


For an easy solution v_p, choose the free variables to be v_2 = v_4 = 0. Then the two
nonzero equations give the two pivot variables v_1 = 1 and v_3 = 6. Our particular solution
to Av = b (and also Rv = d) is v_p = (1, 0, 6, 0). This particular solution is my favorite:
free variables are zero, pivot variables come from d. The method always works.
For Rv = d to have a solution, zero rows in R must also be zero in d.
When I is in the pivot rows and columns of R, the pivot variables are in d:


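\[
R v_p = d \qquad
\begin{bmatrix} 1 & 3 & 0 & 2 \\ 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 \\ 0 \\ 6 \\ 0 \end{bmatrix}
= \begin{bmatrix} 1 \\ 6 \\ 0 \end{bmatrix}
\qquad
\begin{array}{l} \text{Pivot variables } 1, 6 \\ \text{Free variables } 0, 0 \end{array}
\]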


Notice how we choose the free variables (as zero) and solve for the pivot variables. After 
the row reduction to R, those steps are quick. When the free variables are zero, the pivot 
variables for v_p are already seen in the right side vector d.


v_particular    The particular solution v_p solves Av_p = b


v_nullspace     The n - r special solutions solve Av_n = 0.


That particular solution to Av = b and Rv = d is (1, 0, 6, 0). The two special (null)
solutions to Rv = 0 come from the two free columns of R, by reversing signs of 3, 2, and 4.
Please notice the form I use for the complete solution v_p + v_n to Av = b:


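\[
\begin{array}{l} \text{Complete solution} \\ \text{one } v_p \\ \text{many } v_n \end{array}
\qquad
v = v_p + v_n =
\begin{bmatrix} 1 \\ 0 \\ 6 \\ 0 \end{bmatrix}
+ v_2 \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix}
+ v_4 \begin{bmatrix} -2 \\ 0 \\ -4 \\ 1 \end{bmatrix}
\]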
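The rule "free variables zero for v_p, one free variable equal to 1 for each special solution" can be written out directly. A minimal sketch, assuming R, d and the pivot columns of this example are already known (indices are 0-based, and the helper `matvec` is hypothetical):

```python
# Reduced system R v = d from the example above
R = [[1, 3, 0, 2], [0, 0, 1, 4], [0, 0, 0, 0]]
d = [1, 6, 0]
pivots = [0, 2]                      # pivot columns (0-based)
n = 4
free = [j for j in range(n) if j not in pivots]

# Particular solution: free variables 0, pivot variables copied from d
v_p = [0] * n
for row, col in enumerate(pivots):
    v_p[col] = d[row]

# One special solution per free column: that free variable is 1, the other
# free variables are 0, and the pivot variables reverse the signs found in
# the free column of R
specials = []
for f in free:
    s = [0] * n
    s[f] = 1
    for row, col in enumerate(pivots):
        s[col] = -R[row][f]
    specials.append(s)

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]

A = [[1, 3, 0, 2], [0, 0, 1, 4], [1, 3, 1, 6]]
print(v_p, specials)
print(matvec(A, v_p))                        # should reproduce b = (1, 6, 7)
print([matvec(A, s) for s in specials])      # should be zero vectors
```

Any choice of v_2 and v_4 in v_p + v_2 s_1 + v_4 s_2 then solves the original Av = b.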


Question Suppose A is a square invertible matrix, m = n = r. What are v_p and v_n?
Answer If A^{-1} exists, the particular solution is the one and only solution v = A^{-1}b.
There are no special solutions or free variables. R = I has no zero rows. The only vector
in the nullspace is v_n = 0. The complete solution is v = v_p + v_n = A^{-1}b + 0.

This was the situation in Chapter 4. We didn’t mention the nullspace in that chapter. 
N(A) contained only the zero vector. Reduction goes from [A b] to [I A^{-1}b]. The
original Av = b is reduced all the way to v = A^{-1}b which is d. This is a special case
here, but square invertible matrices are the ones we see most often in practice. So they got 
their own chapter at the start of linear algebra. 
For small examples we can reduce [A b] to [R d]. For a large matrix,
MATLAB does it better. One particular solution (not necessarily ours) is A\b from the 
backslash command. Here is an example with full column rank. Both columns have pivots. 


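Example 1 Find the condition on (b_1, b_2, b_3) for Av = b to be solvable, if


\[
A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ -2 & -3 \end{bmatrix}
\quad \text{and} \quad
b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}.
\]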


This condition puts b in the column space of A. Find the complete v = v_p + v_n.


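Solution Use the augmented matrix, with its extra column b. Subtract row 1 of [A b]
from row 2, and add 2 times row 1 to row 3. Then add row 2 to row 3 and subtract
row 2 from row 1 to reach [R d]:


\[
\begin{bmatrix} 1 & 1 & b_1 \\ 1 & 2 & b_2 \\ -2 & -3 & b_3 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 1 & b_1 \\ 0 & 1 & b_2 - b_1 \\ 0 & -1 & b_3 + 2b_1 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 0 & 2b_1 - b_2 \\ 0 & 1 & b_2 - b_1 \\ 0 & 0 & b_3 + b_1 + b_2 \end{bmatrix}
\]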


The last equation is 0 = 0 provided b_3 + b_1 + b_2 = 0. This is the condition that puts
b in the column space; then Av = b will be solvable. The rows of A add to the zero row.
So for consistency (these are equations!) the entries of b must also add to zero. This example
has no free variables since n - r = 2 - 2. Therefore no special solutions. The rank is r = n
so the only null solution is v_n = 0. The unique particular solution to Av = b and Rv = d
is at the top of the augmented column d:


\[
\text{Only one solution} \qquad
v = v_p + v_n = \begin{bmatrix} 2b_1 - b_2 \\ b_2 - b_1 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\]


If b_3 + b_1 + b_2 is not zero, there is no solution to Av = b (v_p doesn’t exist).
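Example 1 is small enough to check by direct substitution. The sketch below assumes the 3 by 2 matrix of this example; the function name `solve_example` is hypothetical and simply encodes the condition and the formula read off from [R d].

```python
A = [[1, 1], [1, 2], [-2, -3]]

def solve_example(b):
    """Return the unique solution of Av = b for this A, or None when
    b fails the condition b1 + b2 + b3 = 0."""
    b1, b2, b3 = b
    if b1 + b2 + b3 != 0:
        return None                    # b is not in the column space
    return [2 * b1 - b2, b2 - b1]      # top two entries of d

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]

b = [1, 2, -3]                         # entries add to zero
v = solve_example(b)
print(v, matvec(A, v))                 # A v should reproduce b
print(solve_example([1, 1, 1]))        # entries do not add to zero
```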

This example is typical of an extremely important case: A has full column rank. 
Every column has a pivot. The rank is r = n. The matrix is tall and thin (m > n). 
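Elimination puts I at the top, when A is reduced to R with rank n:


\[
\text{Full column rank} \qquad
R = \begin{bmatrix} I \\ 0 \end{bmatrix}
= \begin{bmatrix} n \text{ by } n \text{ identity matrix} \\ m - n \text{ rows of zeros} \end{bmatrix}
\tag{1}
\]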


There are no free columns or free variables. The nullspace is Z. 
We will collect together the different ways of recognizing this type of matrix. 


Every matrix A with full column rank (r = n) has all these properties:


1. All columns of A are pivot columns. They are independent.
2. There are no free variables or special solutions.
3. Only the zero vector v = 0 solves Av = 0 and is in the nullspace N(A).
4. If Av = b has a solution (it might not) then it has only one solution.
In the essential language of the next section, A has independent columns if r = n.
Av = 0 only happens when v = 0. Eventually we will add one more fact to the list:
The square matrix A^T A is invertible when the columns are independent.

In Example 1 the nullspace of A (and R) has shrunk to the zero vector. The solution to 
Av = b is unique (if it exists). There will be m - n (here 3 - 2) zero rows in R. There are
m - n conditions on b to have 0 = 0 in those rows. Then b is in the column space.

With full column rank, Av = b has one solution or no solution: m > n is overdetermined.


The Complete Solution 


The other extreme case is full row rank. Now Av = b has one or infinitely many solutions. 
In this case A must be short and wide (m < n). A matrix has full row rank if r = m 
(“independent rows”). Every row has a pivot, and here is an example.


Example 2 There are n = 3 unknowns but only m = 2 equations:


\[
\text{Full row rank} \qquad
\begin{aligned} x + y + z &= 3 \\ x + 2y - z &= 4 \end{aligned}
\qquad (\text{rank } r = m = 2)
\]


These are two planes in xyz space. The planes are not parallel so they intersect in a line. 
This line of solutions is exactly what elimination will find. The particular solution will 
be one point on the line. Adding the nullspace vectors v,, will move us along the line. 
Then v = v_p + v_n gives the whole line of solutions.


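We find v_p and v_n by elimination on [A b]. Subtract row 1 from row 2 and then
subtract row 2 from row 1:


\[
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 1 & 2 & -1 & 4 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 1 & 1 & 3 \\ 0 & 1 & -2 & 1 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 0 & 3 & 2 \\ 0 & 1 & -2 & 1 \end{bmatrix}
= \begin{bmatrix} R & d \end{bmatrix}
\]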


The particular solution has free variable v_3 = 0. The special solution has v_3 = 1:


v_particular comes directly from d on the right side: v_p = (2, 1, 0)
s comes from the third column (free column) of R: s = (-3, 2, 1)

It is wise to check that v_p and s satisfy the original equations Av_p = b and As = 0:

    2 + 1 = 3        -3 + 2 + 1 = 0
    2 + 2 = 4        -3 + 4 - 1 = 0


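The nullspace solution v_n is any multiple of s. It moves along the line of solutions,
starting at v_particular. Please notice again how to write the answer:


\[
\text{Complete solution} \qquad
v = v_p + v_n = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} + v_3 \begin{bmatrix} -3 \\ 2 \\ 1 \end{bmatrix}
\]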
[Figure: the line of solutions to Av = b and the line of solutions to Av = 0]


Figure 5.3: Complete solution = one particular solution + all nullspace solutions.


The line of solutions is drawn in Figure 5.3. Any point on the line could have been chosen 
as the particular solution; we chose the point with v3 = 0. 

The particular solution is not multiplied by an arbitrary constant! The special solution 
is, and you understand why. 
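A quick numerical check makes the last point concrete: every point v_p + c s stays on the line of solutions, while a multiple of v_p does not. This is a small sketch for Example 2 only, with hypothetical helper names.

```python
A = [[1, 1, 1], [1, 2, -1]]
b = [3, 4]
v_p = [2, 1, 0]          # particular solution (free variable v_3 = 0)
s = [-3, 2, 1]           # special solution (free variable v_3 = 1)

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]

def point_on_line(c):
    """The solution v_p + c*s for a chosen constant c."""
    return [p + c * si for p, si in zip(v_p, s)]

for c in (-2, 0, 1, 5):
    print(c, matvec(A, point_on_line(c)))    # always equals b

# Multiplying the particular solution by 2 doubles the right side
print(matvec(A, [2 * p for p in v_p]))
```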

Now we summarize this short wide case of full row rank. If m < n the equations Av = b
are underdetermined (they have many solutions if they have one).


Every matrix A with full row rank (r = m) has all these properties:


1. All m rows have pivots, and R has no zero rows.
2. Av = b has a solution for every right side b.
3. The column space is the whole space R^m.
4. There are n - r = n - m special solutions in the nullspace of A.


In this case with m pivots, the rows are “linearly independent.” We are more than ready 
for the idea of linear independence, as soon as we summarize the four possibilities— 
which depend on the rank. Notice how r, m, n are the critical numbers. 


The four possibilities for linear equations depend on the rank r. 


r = m and r = n    Square and invertible    Av = b has 1 solution
r = m and r < n    Short and wide           Av = b has ∞ solutions
r < m and r = n    Tall and thin            Av = b has 0 or 1 solution
r < m and r < n    Not full rank            Av = b has 0 or ∞ solutions
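The four cases depend only on r, m, n, so they are easy to sort by machine. The following is a minimal sketch, not a library routine: `rank` counts pivots by exact elimination and `classify` is a hypothetical helper that returns the matching row of the table.

```python
from fractions import Fraction

def rank(A):
    """Count the pivots found by forward elimination (exact arithmetic)."""
    M = [[Fraction(x) for x in row] for row in A]
    m, n = len(M), len(M[0])
    r = 0
    for c in range(n):
        if r == m:
            break
        p = next((i for i in range(r, m) if M[i][c] != 0), None)
        if p is None:
            continue                       # no pivot in this column
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, m):          # clear entries below the pivot
            f = M[i][c] / M[r][c]
            M[i] = [a - f * t for a, t in zip(M[i], M[r])]
        r += 1
    return r

def classify(A):
    """Number of solutions to Av = b, as in the table of four possibilities."""
    m, n, r = len(A), len(A[0]), rank(A)
    if r == m and r == n:
        return "1 solution"
    if r == m:
        return "infinitely many solutions"
    if r == n:
        return "0 or 1 solution"
    return "0 or infinitely many solutions"

print(classify([[1, 0], [0, 1]]))                            # square, invertible
print(classify([[1, 1, 1], [1, 2, -1]]))                     # short and wide
print(classify([[1, 1], [1, 2], [-2, -3]]))                  # tall and thin
print(classify([[1, 3, 0, 2], [0, 0, 1, 4], [1, 3, 1, 6]]))  # not full rank
```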


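The reduced R will fall in the same category as the matrix A. They have the same rank.
In case the pivot columns happen to come first, we can display these four possibilities for
R. For Rv = d (and the original Av = b) to be solvable, d must end in m - r zeros.

\[
\begin{array}{lcccc}
\text{Four types} & R = \begin{bmatrix} I \end{bmatrix} & \begin{bmatrix} I & F \end{bmatrix} & \begin{bmatrix} I \\ 0 \end{bmatrix} & \begin{bmatrix} I & F \\ 0 & 0 \end{bmatrix} \\[4pt]
\text{Their ranks} & r = m = n & r = m < n & r = n < m & r < m,\ r < n
\end{array}
\]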


Cases 1 and 2 have full row rank r = m. Cases 1 and 3 have full column rank r = n. 
Case 4 is the most general in theory and it is the least common in practice. 
REVIEW OF THE KEY IDEAS


1. The rank r is the number of pivots. The reduced matrix R has m - r zero rows.
2. Av = b is solvable if and only if the last m - r equations in Rv = d are 0 = 0.
3. One particular solution v_p has all free variables equal to zero.
4. The r pivot variables are determined after the n - r free variables are chosen.
5. Full column rank r = n means no free variables: one solution or no solution.
6. Full row rank r = m means one solution if m = n or infinitely many if m < n.


WORKED EXAMPLES


5.3 A This question connects elimination (pivot columns and back substitution) to
column space, nullspace, rank, and solvability (the full picture). A is 3 by 4 with rank 2:


\[
Av = b \ \text{ is } \quad
\begin{aligned}
v_1 + 2v_2 + 3v_3 + 5v_4 &= b_1 \\
2v_1 + 4v_2 + 8v_3 + 12v_4 &= b_2 \\
3v_1 + 6v_2 + 7v_3 + 13v_4 &= b_3
\end{aligned}
\]


1. Reduce [A b] to [U c], so that Av = b becomes a triangular system Uv = c.
2. Find the condition on b_1, b_2, b_3 for Av = b to have a solution.
3. Describe the column space of A. Which plane in R^3 is the column space?
4. Describe the nullspace of A. What are the special solutions in R^4?
5. Find a particular solution to Av = (0, 6, -6) and then the complete solution.


Solution 


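1. The multipliers in elimination are 2 and 3 and -1. They take [A b] into [U c].


\[
\begin{bmatrix} 1 & 2 & 3 & 5 & b_1 \\ 2 & 4 & 8 & 12 & b_2 \\ 3 & 6 & 7 & 13 & b_3 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 2 & 3 & 5 & b_1 \\ 0 & 0 & 2 & 2 & b_2 - 2b_1 \\ 0 & 0 & -2 & -2 & b_3 - 3b_1 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 2 & 3 & 5 & b_1 \\ 0 & 0 & 2 & 2 & b_2 - 2b_1 \\ 0 & 0 & 0 & 0 & b_3 + b_2 - 5b_1 \end{bmatrix}
\]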


2. The last equation shows the solvability condition b_3 + b_2 - 5b_1 = 0. Then 0 = 0.
3. First description: The column space is the plane containing all combinations of the
pivot columns (1, 2, 3) and (3, 8, 7). Those columns are in A, not in U or R.
Second description: The column space contains all vectors with b_3 + b_2 - 5b_1 = 0.
That makes Av = b solvable. All columns of A pass this test b_3 + b_2 - 5b_1 = 0. This
is the equation for the plane in the first description of the column space.
4. The special solutions have free variables v_2 = 1, v_4 = 0 and then v_2 = 0, v_4 = 1:
s_1 = (-2, 1, 0, 0) and s_2 = (-2, 0, -1, 1). The nullspace contains all c_1 s_1 + c_2 s_2.
5. One particular solution v_p has free variables = zero. Back substitute in Uv = c:


\[
\text{Particular solution to } Av_p = b = (0, 6, -6) \qquad
v_p = \begin{bmatrix} -9 \\ 0 \\ 3 \\ 0 \end{bmatrix}
\]

This vector b satisfies b_3 + b_2 - 5b_1 = 0. The complete solution is v = v_p + v_n.


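5.3 B Find the complete solution v = v_p + v_n by forward elimination on [A b]:


\[
\begin{bmatrix} 1 & 2 & 1 & 0 \\ 2 & 4 & 4 & 8 \\ 4 & 8 & 6 & 8 \end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{bmatrix}
= \begin{bmatrix} 4 \\ 2 \\ 10 \end{bmatrix}
\]

Find numbers y_1, y_2, y_3 so that y_1 (row 1) + y_2 (row 2) + y_3 (row 3) = zero row.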


Check that b = (4, 2,10) satisfies the condition y1b1 + y2b2 + y3b3 = 0. Why is this 
the condition for the equations to be solvable and b to be in the column space? 


Solution Forward elimination on [A b] produces a zero row in [U c]. The third equation
becomes 0 = 0. The equations are consistent (and solvable because 0 = 0):

\[
\begin{bmatrix} 1 & 2 & 1 & 0 & 4 \\ 2 & 4 & 4 & 8 & 2 \\ 4 & 8 & 6 & 8 & 10 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 2 & 1 & 0 & 4 \\ 0 & 0 & 2 & 8 & -6 \\ 0 & 0 & 2 & 8 & -6 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 2 & 1 & 0 & 4 \\ 0 & 0 & 2 & 8 & -6 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]

Columns 1 and 3 contain pivots. The variables v_2 and v_4 are free. If v_2 = v_4 = 0 we can
solve (back substitution) for the particular solution v_p = (7, 0, -3, 0). The 7 and -3 appear
again if elimination continues all the way to the row reduced [R d]:


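\[
\begin{bmatrix} 1 & 2 & 1 & 0 & 4 \\ 0 & 0 & 2 & 8 & -6 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 2 & 1 & 0 & 4 \\ 0 & 0 & 1 & 4 & -3 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 2 & 0 & -4 & 7 \\ 0 & 0 & 1 & 4 & -3 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]

For the nullspace part v_n with b = 0, set the free variables v_2, v_4 to 1, 0 and also 0, 1:


Special solutions s_1 = (-2, 1, 0, 0) and s_2 = (4, 0, -4, 1)


Then the complete solution to Av = b (and Rv = d) is v_complete = v_p + c_1 s_1 + c_2 s_2.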


The rows of A produced the zero row from 2(row 1) + (row 2) — (row 3) = (0,0, 0, 0). 
Thus y = (2, 1, -1). The same combination for b = (4, 2, 10) gives 2(4) + (2) - (10) = 0.
Combinations that give y^T A = zero must also give y^T b = zero. Otherwise no solution.

Later we will say this in different words: y = (2, 1, -1) is in the nullspace of A^T.
Then y will be perpendicular to every b in the column space of A. I am looking ahead... 
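The claims in Worked Example 5.3 B can be verified by plain arithmetic. In this sketch the matrix A is written as it is recovered from the elimination steps (rows (1, 2, 1, 0), (2, 4, 4, 8), (4, 8, 6, 8)), which is an assumption of the illustration; the helper `matvec` is hypothetical.

```python
A = [[1, 2, 1, 0], [2, 4, 4, 8], [4, 8, 6, 8]]
b = [4, 2, 10]
y = [2, 1, -1]

# y^T A: the same combination of the rows of A
yTA = [sum(y[i] * A[i][j] for i in range(3)) for j in range(4)]
# y^T b: that combination applied to the right side
yTb = sum(yi * bi for yi, bi in zip(y, b))
print(yTA, yTb)          # both should be zero for a solvable system

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]

v_p = [7, 0, -3, 0]
s1 = [-2, 1, 0, 0]
s2 = [4, 0, -4, 1]
print(matvec(A, v_p))                    # should reproduce b
print(matvec(A, s1), matvec(A, s2))      # should be zero vectors
```

If y^T b were not zero, the third equation after elimination would read 0 = nonzero, and b would lie outside the column space.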
Problem Set 5.3


1 (Recommended) Execute the six steps of Worked Example 5.3 A to describe the
column space and nullspace of A and the complete solution to Av = b: 


2.44) 6-24 by 4 
2223 0 2, b3 5 
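2 Carry out the same six steps for this matrix A with rank one. You will find two
conditions on b_1, b_2, b_3 for Av = b to be solvable. Together these two conditions
put b into the ____ space.

\[
A = \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} \begin{bmatrix} 2 & 1 & 3 \end{bmatrix}
= \begin{bmatrix} 2 & 1 & 3 \\ 6 & 3 & 9 \\ 4 & 2 & 6 \end{bmatrix}
\qquad
b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} 10 \\ 30 \\ 20 \end{bmatrix}
\]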


Questions 3-15 are about the solution of Av = b. Follow the steps in the text to v_p
and v_n. Start from the augmented matrix [A b].


3 Write the complete solution as v_p plus any multiple of s in the nullspace:


    x + 3y + 3z = 1
    2x + 6y + 9z = 5
    -x - 3y + 3z = 5.


4 Find the complete solution (also called the general solution) to 


1 @12 il 1 
264 8 Yi=|3 
002 4 : 1 


5 Under what condition on b_1, b_2, b_3 is this system solvable? Include b as a fourth
column in elimination. Find all solutions when that condition holds:


    x + 2y - 2z = b_1
    2x + 5y - 4z = b_2
    4x + 9y - 8z = b_3.


6 What conditions on b_1, b_2, b_3, b_4 make each system solvable? Find v in that case:


t 3 by 1 @ 6 by 
2 4|/[v]_| b 24 6 inne 
2 0 | 3 |- bg 2 5: iF -. bg 
3 9 ba 3 9 12 ba 
7 Show by elimination that (b_1, b_2, b_3) is in the column space if b_3 - 2b_2 + 4b_1 = 0.

\[
A = \begin{bmatrix} 1 & 3 & 1 \\ 3 & 8 & 2 \\ 2 & 4 & 0 \end{bmatrix}
\]

What combination y_1 (row 1) + y_2 (row 2) + y_3 (row 3) gives the zero row?


8 Which vectors (b_1, b_2, b_3) are in the column space of A? Which combinations of the
rows of A give zero?

\[
\text{(a)} \ A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 6 & 3 \\ 0 & 2 & 5 \end{bmatrix}
\qquad
\text{(b)} \ A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 2 & 4 & 8 \end{bmatrix}
\]


9 In Worked Example 5.3 A, combine the pivot columns of A with the numbers
-9 and 3 in the particular solution v_p. What is that linear combination and why?


10 Construct a 2 by 3 system Av = b with particular solution v_p = (2, 4, 0) and
null (homogeneous) solution v_n = any multiple of (1, 1, 1).


11 Why can’t a 1 by 3 system have v_p = (2, 4, 0) and v_n = any multiple of (1, 1, 1)?


12 (a) If Av = b has two solutions v_1 and v_2, find two solutions to Av = 0.
(b) Then find another solution to Av = b.


13 Explain why these are all false:

(a) The complete solution is any linear combination of v_p and v_n.
(b) A system Av = b has at most one particular solution.
(c) The solution v_p with all free variables zero is the shortest solution (minimum
length ||v||). Find a 2 by 2 counterexample.
(d) If A is invertible there is no solution v_n in the nullspace.


14 Suppose column 5 has no pivot. Then v_5 is a ____ variable. The zero vector (is)
(is not) the only solution to Av = 0. If Av = b has a solution, then it has ____
solutions.


15 Suppose row 3 has no pivot. Then that row is ____. The equation Rv = d is only
solvable provided ____. The equation Av = b (is) (is not) (might not be) solvable.
Questions 16-21 are about matrices of “full rank” r = m or r = n.


16 The largest possible rank of a 3 by 5 matrix is ____. Then there is a pivot in
every ____ of U and R. The solution to Av = b (always exists) (is unique).
The column space of A is ____. An example is A = ____.


17 The largest possible rank of a 6 by 4 matrix is ____. Then there is a pivot in
every ____ of U and R. The solution to Av = b (always exists) (is unique).
The nullspace of A is ____. An example is A = ____.


18 Find by elimination the rank of A and also the rank of A^T:

\[
A = \begin{bmatrix} 1 & 4 & 0 \\ 2 & 11 & 5 \\ -1 & 2 & 10 \end{bmatrix}
\quad \text{and} \quad
A = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 2 \\ 1 & 1 & q \end{bmatrix}
\ (\text{rank depends on } q).
\]


19 Find the rank of A and also of A^T A and also of AA^T:

\[
A = \begin{bmatrix} 1 & 1 & 5 \\ 1 & 0 & 1 \end{bmatrix}
\quad \text{and} \quad
A = \begin{bmatrix} 2 & 0 \\ 1 & 1 \\ 1 & 2 \end{bmatrix}
\]


20 Reduce A to its echelon form U. Then find a triangular L so that A = LU.


21 Find the complete solution in the form v_p + v_n to these full rank systems:

    (a) x + y + z = 4        (b) x + y + z = 4
                                 x - y + z = 4.


22 If Av = b has infinitely many solutions, why is it impossible for Av = B (new
right side) to have only one solution? Could Av = B have no solution?


23 Choose the number q so that (if possible) the ranks are (a) 1, (b) 2, (c) 3:

\[
A = \begin{bmatrix} 6 & 4 & 2 \\ -3 & -2 & -1 \\ 9 & 6 & q \end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix} 3 & 1 & 3 \\ q & 2 & q \end{bmatrix}
\]


24 Give examples of matrices A for which the number of solutions to Av = b is

(a) 0 or 1, depending on b
(b) ∞, regardless of b
(c) 0 or ∞, depending on b
(d) 1, regardless of b.
25 Write down all known relations between r and m and n if Av = b has

(a) no solution for some b
(b) infinitely many solutions for every b
(c) exactly one solution for some b, no solution for other b
(d) exactly one solution for every b.


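Questions 26-33 are about Gauss-Jordan elimination (upwards as well as downwards)
and the reduced echelon matrix R.


26 Continue elimination from U to R. Divide rows by pivots so the new pivots are all 1.
Then produce zeros above those pivots to reach R:

\[
U = \begin{bmatrix} 2 & 4 & 4 \\ 0 & 3 & 6 \\ 0 & 0 & 0 \end{bmatrix}
\quad \text{and} \quad
U = \begin{bmatrix} 2 & 4 & 4 \\ 0 & 3 & 6 \\ 0 & 0 & 5 \end{bmatrix}
\]


27 Suppose U is square with n pivots (an invertible matrix). Explain why R = I.


28 Apply Gauss-Jordan elimination to Uv = 0 and Uv = c. Reach Rv = 0 and Rv = d:

\[
\begin{bmatrix} U & 0 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 & 0 \\ 0 & 0 & 4 & 0 \end{bmatrix}
\quad \text{and} \quad
\begin{bmatrix} U & c \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 & 5 \\ 0 & 0 & 4 & 8 \end{bmatrix}
\]

Solve Rv = 0 to find v_n (its free variable is v_2 = 1). Solve Rv = d to find v_p
(its free variable is v_2 = 0).


29 Apply Gauss-Jordan elimination to reduce to Rv = 0 and Rv = d:

\[
\begin{bmatrix} U & 0 \end{bmatrix} = \begin{bmatrix} 3 & 0 & 6 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\quad \text{and} \quad
\begin{bmatrix} U & c \end{bmatrix} = \begin{bmatrix} 3 & 0 & 6 & 9 \\ 0 & 0 & 2 & 4 \\ 0 & 0 & 0 & 5 \end{bmatrix}
\]

Solve Uv = 0 or Rv = 0 to find v_n (free variable = 1). What are the solutions to
Rv = d?


30 Reduce to Uv = c (Gaussian elimination) and then Rv = d (Gauss-Jordan):

\[
Av = \begin{bmatrix} 1 & 0 & 2 & 3 \\ 1 & 3 & 2 & 0 \\ 2 & 0 & 4 & 9 \end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{bmatrix}
= \begin{bmatrix} 2 \\ 5 \\ 10 \end{bmatrix} = b.
\]

Find a particular solution v_p and all homogeneous (null) solutions v_n.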
31


32 


33 


34 


35 


36 


Find matrices A and B with the given property or explain why you can’t: 
1 


(a) The only solution of Av = | 2 | isu = f | 
3 
0 1 
(b) The only solution of Bu = 1 | 1S;v =") "2 
3 


Reduce [A b] to [R d] and find the complete solution to Av = b:


and b= andthen b= 


ee eb Ww 
anwr- 

aOwW re 
oo 6& 


oO 


The complete solution to Av = 3 isv= ; | eC 5 } Find A. 


Challenge Problems 


Suppose you know that the 3 by 4 matrix A has the vector s = (2,3, 1,0) as the only 
special solution to Av = 0. 


(a) What is the rank of A and the complete solution to Av = 0? 
(b) What is the exact row reduced echelon form R of A? Good question. 
(c) How do you know that Av = b can be solved for all b? 


If you have this information about the solutions to Av = b for a specific b, what does 
that tell you about the shape of A (m and n)? And possibly about r and b.


1. There is exactly one solution. 
2. All solutions to Av = b have the form v = [2] ae c[ : | : 
3. There are no solutions. 
4. Allsolutions to Av = b have the form v = [2] +e [| 
5. There are infinitely many solutions. 
Suppose Av = b and Cv = b have the same (complete) solutions for every b.


Is it true that A = C?
5.4 Independence, Basis and Dimension


This important section is about the true size of a subspace. There are n columns in an m by
n matrix. But the true “dimension” of the column space is not necessarily n. The dimension
is measured by counting independent columns—and we have to say what that means. We 
will see that the true dimension of the column space is the rank r. 

The idea of independence applies to any vectors u_1, ..., u_n in any vector space. Most
of this section concentrates on the subspaces that we know and use—especially the column 
space and the nullspace of A. In the last part we also study “vectors” that are not column 
vectors. They can be matrices, or solutions to differential equations. They can be linearly 
independent (or dependent). First come the key examples using column vectors. 

The goal is to understand a basis : independent vectors that “span the space”. 


Any basis Each vector in the space is a unique combination of the basis vectors. 


We are at the heart of our subject, and we cannot go on without a basis. The four essential 
ideas in this section (with first hints at their meaning) are: 


1. Independent vectors (no extra vectors) 

2. Spanning a space (their combinations produce the whole space) 

3. Basis for a space (independent and spanning : not too many or too few) 
4. Dimension of a space (the number of vectors in each and every basis) 


Bases for Important Spaces 


Here are three examples to show you what a basis looks like (before the definition). 
A basis is a set of vectors that perfectly describes all vectors in the space. Take all
combinations of the basis vectors to get every vector in the space. 


1. Basis for the column space of A 


A natural choice is the r pivot columns. Their combinations yield all columns. 


2. Basis for the nullspace of A 


A natural choice is the set of n — r special solutions to Av = 0. 


3. Basis for the space of null solutions to Ay'' + By' + Cy = 0


A natural choice is the pair of solutions y_1 = e^{s_1 t} and y_2 = e^{s_2 t}. These exponents
s_1 and s_2 satisfy As^2 + Bs + C = 0, so y_1 and y_2 solve the differential equation.


If s is a double root of the quadratic, then y_2 = te^{st} can be the second member
of the basis. (Always two y’s for a linear second order equation.) All other solutions
are combinations of y_1 and y_2. Then y_1 and y_2 span the solution space.
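As a worked illustration, here are two equations chosen for this note (they are not taken from the text), one with distinct roots and one with a double root. In each case the basis is read off from the roots of As^2 + Bs + C = 0:

```latex
\begin{align*}
y'' + 3y' + 2y = 0: \quad & s^2 + 3s + 2 = (s+1)(s+2) = 0, \quad s_1 = -1,\ s_2 = -2, \\
& y = c_1 e^{-t} + c_2 e^{-2t}. \\[4pt]
y'' + 2y' + y = 0: \quad & s^2 + 2s + 1 = (s+1)^2 = 0, \quad s = -1 \ \text{(double root)}, \\
& y = c_1 e^{-t} + c_2\, t e^{-t}.
\end{align*}
```

In both cases the solution space has dimension 2, with two basis functions and two arbitrary constants.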
The dimension of a space is easy. Just count the number of basis vectors:

Column space      Nullspace            Solution space
Dimension r       Dimension n - r      Dimension 2


Those bases were natural choices. They are not at all the only bases. A space has many 
different bases. The column space of this matrix A is the whole space R^2.


1 3% 1. Pivot columns 1 and 2 
A= 25 9 | Bases forC(A) 2. Columns 1 and 3, or columns 2 and 3 
3. Any independent v and w in R? 


The vectors (1,0) and (0, 1) are a perfectly good basis for the column space of this A. 


Linear Independence 


Our first definition of independence is not so conventional, but you are ready for it. 


DEFINITION The columns of A are linearly independent when the only solution to
Av = 0 is v = 0. No combination Av of the columns is the zero vector, except v = 0.


The columns are independent when the nullspace N(A) contains only the zero vector.
Let me illustrate linear independence (and dependence) with three vectors in R^3:


1. If three vectors are not in the same plane, they are independent. No combination of
u_1, u_2, u_3 in Figure 5.4 gives zero except the combination 0u_1 + 0u_2 + 0u_3.


2. If three vectors w_1, w_2, w_3 are in the same plane, they are dependent.


This idea of independence applies to 7 vectors in 12-dimensional space. If they are the 
columns of A, and independent, the nullspace only contains v = 0. None of the vectors is a 
combination of the other six vectors. 

Now we express the same idea in different words. The following definition of independence
will apply to any sequence of vectors in any vector space. When the vectors are the
columns of A, the two definitions say exactly the same thing.


[Figure: vectors u_1, u_2, u_3 not in a plane, and vectors w_1, w_2, w_3 in a plane]


Figure 5.4: Independent vectors u_1, u_2, u_3. Only 0u_1 + 0u_2 + 0u_3 gives the vector 0.
Dependent vectors w_1, w_2, w_3. The combination w_1 - w_2 + w_3 is (0, 0, 0).
DEFINITION The sequence of vectors u_1, ..., u_n is linearly independent if the only
combination that gives the zero vector is 0u_1 + 0u_2 + ... + 0u_n.


x_1 u_1 + x_2 u_2 + ... + x_n u_n = 0 only happens when all x’s are zero. (1)


If a combination gives 0, when the x’s are not all zero, the vectors are dependent.


Correct language: “The sequence of vectors is linearly independent.” Acceptable
shortcut : “The vectors are independent.” Not acceptable: “The matrix is independent.” 


A sequence of vectors is either dependent or independent. They can be combined to give
the zero vector (with nonzero x’s) or they can’t. So the key question is: Which combinations
of the vectors give zero? We begin with some small examples in R^2:


(a) The vectors (1,0) and (1, 0.00001) are independent. 

(b) The vectors (1, 1) and (—1, —1) on the same line through (0, 0) are dependent. 
(c) The vectors (1, 1) and (0,0) are dependent because of the zero vector. 

(d) In R^2, any three vectors (a, b) and (c, d) and (e, f) are dependent.


The columns of A are dependent exactly when there is a nonzero vector in the nullspace. 

If one of the u’s is the zero vector, independence has no chance. Why not?

Three vectors in R^2 cannot be independent! The matrix A with those three columns
must have a free variable and then a special solution As = 0. The nullspace is larger than
Z. For three vectors in R^3, we put them in a matrix and try to solve Av = 0.


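Example 1 The columns of this A are dependent. The nonzero vector v has Av = 0.


\[
Av = \begin{bmatrix} 1 & 0 & 3 \\ 2 & 1 & 5 \\ 1 & 0 & 3 \end{bmatrix}
\begin{bmatrix} -3 \\ 1 \\ 1 \end{bmatrix}
\ \text{ is } \
-3 \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}
+ 1 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
+ 1 \begin{bmatrix} 3 \\ 5 \\ 3 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]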

The rank is only r = 2. Independent columns produce full column rank r = n. 
In that matrix the rows are also dependent. Row 1 minus row 3 is the zero row. For a 
square matrix, we will show that dependent columns imply dependent rows. 


Question How to find that solution to Av = 0? The systematic way is elimination. 


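\[
A = \begin{bmatrix} 1 & 0 & 3 \\ 2 & 1 & 5 \\ 1 & 0 & 3 \end{bmatrix}
\ \text{ reduces to } \
R = \begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}
\]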

The solution v = (—3, 1,1) was exactly the special solution. It shows how the free column 
(column 3) is a combination of the pivot columns. That kills independence! 
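The special solution translates directly into a statement about the columns: -3(column 1) + (column 2) + (column 3) = 0, so column 3 = 3(column 1) - (column 2). A short check for this example, with variable names chosen only for the illustration:

```python
A = [[1, 0, 3], [2, 1, 5], [1, 0, 3]]
v = [-3, 1, 1]                     # the special solution

# A v: combine the columns with the weights in v
Av = [sum(a * x for a, x in zip(row, v)) for row in A]
print(Av)                          # zero vector, so the columns are dependent

cols = list(zip(*A))               # columns of A as tuples
# The free column written as a combination of the two pivot columns
combo = [3 * a - b for a, b in zip(cols[0], cols[1])]
print(combo, list(cols[2]))        # the two should agree
```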


The columns of A are independent when the rank is r = n: 


Full column rank r = n: n pivots and no free variables. Only v = 0 is in the nullspace.
Dependent columns if n > m. Suppose seven columns have five components each
(m = 5 is less than n = 7). Then the columns must be dependent. Any seven vectors
from R^5 are dependent. The rank of A cannot be larger than 5. There cannot be more than
five pivots in five rows. Av = 0 has at least 7 - 5 = 2 free variables, so it has nonzero
solutions, which means that the columns are dependent.


Any set of n vectors in R^m must be linearly dependent if n > m.


This type of matrix has more columns than rows—it is short and wide. The columns are 
certainly dependent if n > m, because Av = 0 has a nonzero solution. Elimination will
reveal the r pivot columns. Those r pivot columns are independent. 
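For a short wide matrix, elimination always produces at least one free column and therefore a nonzero solution to Av = 0. The sketch below shows this for one small matrix chosen for illustration; the function name `special_solution` is hypothetical and the arithmetic is exact.

```python
from fractions import Fraction

def special_solution(A):
    """Return one nonzero v with Av = 0, or None when the columns
    are independent (no free column)."""
    M = [[Fraction(x) for x in row] for row in A]
    m, n = len(M), len(M[0])
    pivots, r = [], 0
    for c in range(n):
        if r == m:
            break
        p = next((i for i in range(r, m) if M[i][c] != 0), None)
        if p is None:
            continue                       # free column
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [x / piv for x in M[r]]
        for i in range(m):                 # zeros above and below the pivot
            if i != r:
                f = M[i][c]
                M[i] = [a - f * t for a, t in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    free = [j for j in range(n) if j not in pivots]
    if not free:
        return None
    f = free[0]
    v = [Fraction(0)] * n
    v[f] = Fraction(1)                     # first free variable set to 1
    for row, col in enumerate(pivots):
        v[col] = -M[row][f]                # pivot variables reverse the signs
    return v

A = [[1, 2, 3], [4, 5, 6]]                 # n = 3 columns in R^2 (m = 2)
v = special_solution(A)
print(v)
print([sum(a * x for a, x in zip(row, v)) for row in A])   # should be zeros
print(special_solution([[1, 0], [0, 1]]))                  # independent columns
```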


Note Another way to describe linear dependence is this : “One vector is a combination of 
the other vectors.” That sounds clear. Why don’t we say this? Our definition was longer: 
“Some combination gives the zero vector, other than the trivial combination with every 
v = 0.” Our definition doesn’t pick out one particular vector as guilty. 


All columns of A are treated the same. We look at Av = 0, and it has a nonzero
solution or it hasn’t. In the end that is better than asking if the last column (or the first, or 
a column in the middle) is a combination of the others. 


Spanning a Subspace 


The first subspace in this book was the column space. Starting with columns a_1, ..., a_n, the
subspace was filled out by including all their combinations v_1 a_1 + ... + v_n a_n.
The column space consists of all combinations Av of the columns. We now introduce the 
single word “span” to describe this : The column space is spanned by the columns. 


DEFINITION A set of vectors spans a space if their linear combinations fill the space. 


The columns of a matrix span its column space. They might be dependent. 


Example 2 u_1 = (1, 0) and u_2 = (0, 1) span the full two-dimensional space R^2.


Example 3 u_1 = (1, 0), u_2 = (0, 1), u_3 = (4, 7) also span the full space R^2.


Example 4 w_1 = (1, 1) and w_2 = (-1, -1) only span a line in R^2. So does w_1 alone.


Think of two vectors coming out from (0, 0,0) in 3-dimensional space. Generally they 
span a plane. Your mind fills in that plane by taking linear combinations. Mathematically 
you know other possibilities : two vectors could span a line, three vectors could span all of 
R^3, or they could span only a plane or a line or Z.
It is possible that three vectors span only a line in R^3, or ten vectors span only a plane.
They are certainly not independent! 

The columns span the column space. Here is a new subspace—spanned by the rows. 
The combinations of the rows produce the “row space”.


DEFINITION The row space of a matrix is the subspace of R^n spanned by the rows.
The row space of A is C(A^T). It is the column space of A^T.


The rows of an m by n matrix have n components. They are vectors in R^n, or they
would be if they were written as column vectors. There is a quick way to fix that:
Transpose the matrix. Instead of the rows of A, look at the columns of A^T. Same numbers,
but now in the column space of A^T. This row space C(A^T) is a subspace of R^n.


Example 5 The column space of A is a plane. The row space is all of R^2.


\[
A = \begin{bmatrix} 1 & 4 \\ 2 & 7 \\ 3 & 5 \end{bmatrix}
\quad \text{and} \quad
A^T = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 7 & 5 \end{bmatrix}.
\quad \text{Here } m = 3 \text{ and } n = 2.
\]


The row space is spanned by the three rows of A (which are columns of A^T). The columns
are in R^m spanning the column space. Same numbers, different vectors, different spaces.


A Basis for a Vector Space 


Two vectors can’t span all of R^3, even if they are independent. Four vectors can’t be
independent, even if they span R^3. We want enough independent vectors to span the
space (and not more). A “basis” is just right. 


DEFINITION A basis for a vector space is a sequence of vectors with two properties : 


The basis vectors are linearly independent and they span the space. 


This combination of properties is fundamental to linear algebra. Every vector u in the space
is a combination of the basis vectors, because they span the space. More than that, the
combination that produces u is unique, because the basis vectors u_1, ..., u_n are independent:


Reason: Suppose u = a_1 u_1 + ... + a_n u_n and also u = b_1 u_1 + ... + b_n u_n. By subtraction
(a_1 - b_1)u_1 + ... + (a_n - b_n)u_n is the zero vector. From the independence of the u’s,
each a_i - b_i = 0. Hence a_i = b_i, and there are not two ways to produce u.
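Uniqueness means the coefficients can be computed. A minimal sketch in R^2, assuming integer components and using Cramer's rule (the function name `coordinates` and the sample vectors are chosen for this illustration only):

```python
from fractions import Fraction

def coordinates(u1, u2, u):
    """Solve a1*u1 + a2*u2 = u in R^2 by Cramer's rule.
    Raises ValueError when u1, u2 are dependent (not a basis)."""
    det = u1[0] * u2[1] - u2[0] * u1[1]
    if det == 0:
        raise ValueError("u1 and u2 are dependent, so they are not a basis")
    a1 = Fraction(u[0] * u2[1] - u2[0] * u[1], det)
    a2 = Fraction(u1[0] * u[1] - u[0] * u1[1], det)
    return a1, a2

# The basis (1, 1), (1, 2) produces u = (3, 5) in exactly one way
a1, a2 = coordinates((1, 1), (1, 2), (3, 5))
print(a1, a2)
```

With dependent vectors such as (1, 1) and (-1, -1) the determinant is zero and no unique coefficients exist, which is why the sketch refuses to answer in that case.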
Example 6 The columns of the identity matrix I are the “standard basis” for R^n.


The basis vectors i = (1, 0) and j = (0, 1) are independent. They span R^2.


Everybody thinks of this basis first. The vector i goes across and j goes straight up.
The columns of the 3 by 3 identity matrix are the standard basis i, j, k for R^3.
Now we find many other bases (infinitely many). The basis is not unique!


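Example 7 (Important) The columns of every invertible n by n matrix give a basis for R^n:


\[
\begin{array}{l} \text{Invertible matrix} \\ \text{Independent columns} \\ \text{Column space is } R^3 \end{array}
\ A = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}
\qquad
\begin{array}{l} \text{Singular matrix} \\ \text{Dependent columns} \\ \text{Column space} \ne R^3 \end{array}
\ B = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 2 \\ 1 & 1 & 2 \end{bmatrix}
\]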


The only solution to Av = 0 is v = A^{-1}0 = 0. The columns are independent. They span
the whole space R^n, because every vector b is a combination of the columns. Av = b can
always be solved by v = A^{-1}b. Do you see how everything comes together for invertible
matrices? Here it is in one sentence: 


The vectors v1, ..., Un are a basis for R” exactly when they are the columns of an 


n by n invertible matrix. The vector space R” has infinitely many different bases. 
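
That one-sentence test can be tried on a computer. What follows is only a minimal sketch: the helper names `rank` and `is_basis_of_Rn` are hypothetical, elimination is done with exact fractions, and the dependent set of vectors is an invented illustration rather than a matrix from the text.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by elimination with exact fractions."""
    M = [[Fraction(x) for x in row] for row in rows]
    m, n = len(M), len(M[0])
    r = 0  # number of pivots found so far
    for c in range(n):
        p = next((i for i in range(r, m) if M[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, m):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == m:
            break
    return r

def is_basis_of_Rn(vectors):
    """True when the given vectors (each of length n) are a basis for R^n."""
    n = len(vectors[0])
    # Row rank equals column rank, so the vectors may be stored as rows.
    return len(vectors) == n and rank(vectors) == n

columns_of_A = [(1, 1, 1), (0, 1, 1), (0, 0, 1)]  # columns of the invertible A in Example 7
dependent = [(1, 1, 1), (0, 1, 1), (1, 2, 2)]     # hypothetical: third = first + second

print(is_basis_of_Rn(columns_of_A))
print(is_basis_of_Rn(dependent))
```

The count check `len(vectors) == n` matters as much as the rank: two independent vectors in $\mathbf{R}^3$ fail it, and so do four spanning vectors.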


When the columns are dependent, we keep only the pivot columns—the first two columns 
of B above, with its two pivots. They are independent and they span the column space. 


The pivot columns of A are a basis for its column space. The pivot rows are a basis 
for the row space. The pivot rows of the reduced R are also a basis for the row space. 


Example 8 This matrix is not invertible. Its columns are not a basis for anything!


One pivot column, one pivot row (r = 1):

$$A = \begin{bmatrix} 2 & 4 \\ 1 & 2 \end{bmatrix} \quad \text{reduces to} \quad R = \begin{bmatrix} 1 & 2 \\ 0 & 0 \end{bmatrix}$$


Column 1 of A is the pivot column. That column alone is a basis for its column space. 
Column 1 of R is not a basis for the column space of A. That column (1,0) in R does 
not even belong to the column space of A. Elimination changes column spaces. (But the 
dimension remains the same: here dimension = 1.) 

The row space of A is the same as the row space of R. It contains (2, 4) and (1, 2) and all 
other multiples of those vectors. As always, there are infinitely many bases to choose from. 
One natural choice is to pick the nonzero rows of R (rows with a pivot). So this matrix A 
with rank one has only one vector in the basis: 


Basis for the column space: $\begin{bmatrix} 2 \\ 1 \end{bmatrix}$. Basis for the row space: $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$.

Example 9 Find bases for the column and row spaces of this rank two matrix:

$$R = \begin{bmatrix} 1 & 2 & 0 & 3 \\ 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

Columns 1 and 3 are the pivot columns. They are a basis for the column space (of R!).
The vectors in that column space all have the form b = (x, y, 0). This space is the
“xy plane” inside the full xyz space. That plane is not $\mathbf{R}^2$, it is a subspace of $\mathbf{R}^3$.
Columns 2 and 3 are also a basis for the same column space. Which pairs of columns of R
are not a basis for its column space?

The row space of R is a subspace of $\mathbf{R}^4$. The simplest basis for that row space is the
two nonzero rows of R. The third row (the zero vector) is in the row space too. But it is
not in a basis for the row space. The basis vectors must be independent.


Question Given five vectors in $\mathbf{R}^7$, how do you find a basis for the space they span?


First answer Make them the rows of A, and eliminate to find the nonzero rows of R. 

Second answer Put the five vectors into the columns of A. Eliminate to find the pivot 
columns (of A not R). Could another basis have more vectors, or fewer? This question 
has a good answer: No! All bases for a vector space contain the same number of vectors. 
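
The second answer can be sketched in a few lines. This is an illustration under stated assumptions, not the book's own code: the function names `rref` and `pivot_basis` are hypothetical, the arithmetic uses exact fractions, and the five sample vectors (taken in $\mathbf{R}^3$ to keep the output short) are invented for the demonstration.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivot_columns) for the reduced row echelon form."""
    M = [[Fraction(x) for x in row] for row in rows]
    m, n = len(M), len(M[0])
    pivots, r = [], 0
    for c in range(n):
        p = next((i for i in range(r, m) if M[i][c] != 0), None)
        if p is None:
            continue  # free column
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [x / piv for x in M[r]]
        for i in range(m):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    return M, pivots

def pivot_basis(vectors):
    """Put the vectors in the columns of A; keep the pivot columns of A (not of R)."""
    A = [list(row) for row in zip(*vectors)]  # vectors become columns
    _, pivots = rref(A)
    return [vectors[j] for j in pivots]

# Hypothetical sample: v2 = 2*v1 and v4 = v1 + v3, so only three are needed.
vectors = [(1, 2, 0), (2, 4, 0), (0, 1, 1), (1, 3, 1), (0, 0, 1)]
print(pivot_basis(vectors))
```

The basis is taken from the original vectors, because elimination changes the column space while preserving which columns are pivot columns.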


Dimension of a Vector Space 


The number of vectors, in any and every basis, is the “dimension” of the space. 


We have to prove what was stated above. There are many choices for the basis vectors, but 
the number of basis vectors doesn’t change. 


If $u_1, \dots, u_m$ and $w_1, \dots, w_n$ are both bases for the same vector space, then m = n.


Proof Suppose that there are more w’s than u’s. From n > m we want to reach a
contradiction. The u’s are a basis, so $w_1$ must be a combination of the u’s. If $w_1$ equals
$a_{11}u_1 + \cdots + a_{m1}u_m$, this is the first column of a matrix multiplication UA:

Each w is a combination of the u’s:

$$W = \begin{bmatrix} w_1 & w_2 & \cdots & w_n \end{bmatrix} = \begin{bmatrix} u_1 & \cdots & u_m \end{bmatrix} \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} = UA.$$

We don’t know each number $a_{ij}$, but we know the shape of A (it is m by n). The second
vector $w_2$ is also a combination of the u’s. The coefficients in that combination fill the
second column of A. The key is that A has a row for every u and a column for every w.
A is a short wide matrix, since n > m. So Av = 0 has a nonzero solution.

Av = 0 gives UAv = 0 which is Wv = 0. A combination of the w’s gives zero!
Then the w’s could not be a basis: our assumption n > m is not possible for two bases.

If m > n we exchange the u’s and w’s and repeat the same steps. The only way to avoid
a contradiction is to have m = n. This completes the proof that m = n.


The number of basis vectors depends on the space, not on a particular basis. The number
is the same for every basis, and it counts the “degrees of freedom” in the space. The
dimension of the space $\mathbf{R}^n$ is n. We now introduce the important word dimension
for other vector spaces too.


DEFINITION The dimension of a space is the number of vectors in every basis. 


This matches our intuition. The line through u = (1, 5, 2) has dimension one. It is a subspace
with this one vector u in its basis. Perpendicular to that line is the plane
x + 5y + 2z = 0. This plane has dimension 2. To prove it, we find a basis (-5, 1, 0)
and (-2, 0, 1). The dimension is 2 because the basis contains two vectors.

The plane is the nullspace of the matrix $A = \begin{bmatrix} 1 & 5 & 2 \end{bmatrix}$, which has two free variables.
Our basis vectors (-5, 1, 0) and (-2, 0, 1) are the “special solutions” to Av = 0. The n - r
special solutions give a basis for the nullspace, so the dimension of N(A) is n - r.
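
The special solutions can be produced mechanically: set one free variable to 1, the other free variables to 0, and read the pivot variables from the reduced matrix. The sketch below assumes exact fraction arithmetic, and the names `rref` and `special_solutions` are hypothetical helpers chosen for this illustration.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivot_columns) for the reduced row echelon form."""
    M = [[Fraction(x) for x in row] for row in rows]
    m, n = len(M), len(M[0])
    pivots, r = [], 0
    for c in range(n):
        p = next((i for i in range(r, m) if M[i][c] != 0), None)
        if p is None:
            continue  # free column
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [x / piv for x in M[r]]
        for i in range(m):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    return M, pivots

def special_solutions(rows):
    """One solution of Av = 0 per free variable: a basis for N(A)."""
    R, pivots = rref(rows)
    n = len(rows[0])
    free = [j for j in range(n) if j not in pivots]
    solutions = []
    for f in free:
        s = [Fraction(0)] * n
        s[f] = Fraction(1)              # this free variable is 1, the others 0
        for row_index, p in enumerate(pivots):
            s[p] = -R[row_index][f]     # pivot variable determined by Rv = 0
        solutions.append(s)
    return solutions

plane = special_solutions([[1, 5, 2]])  # nullspace of A = [1 5 2]
print([[int(x) for x in s] for s in plane])
```

The number of vectors returned is n - r, which is the dimension of the nullspace.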


Note about the language of linear algebra We never say “the rank of a space” or “the 
dimension of a basis” or “the basis of a matrix”. Those terms have no meaning. It is the 
dimension of the column space that equals the rank of the matrix. 


Bases for Matrix Spaces and Function Spaces 


The words “independence” and “basis” and “dimension” are not at all restricted to column
vectors. We can ask whether three matrices $A_1, A_2, A_3$ are independent. When they are in
the space of all 3 by 4 matrices, some combination might give the zero matrix. We can also
ask the dimension of the full 3 by 4 matrix space. (It is 12.)

In differential equations, $d^2y/dx^2 = y$ has a space of solutions. One basis is $y = e^x$ and
$y = e^{-x}$. Counting the basis functions gives the dimension 2 for the space of all solutions.
(The dimension is 2 because of the second derivative.)

Matrix spaces and function spaces may look a little strange after $\mathbf{R}^n$. But in some way,
you haven’t got the ideas of basis and dimension straight until you can apply them to
“vectors” other than column vectors.


Example 10 Find a basis for the space of 3 by 3 symmetric matrices.


The basis vectors will be matrices! We need enough to span the space (then every $A = A^T$
is a combination). The matrices must be independent (combinations don’t give the zero
matrix). Here is one basis for the symmetric matrices (many other bases).

You could write every $A = A^T$ as a combination of those six matrices. What coefficients
would produce 1, 4, 5 and 4, 2, 8 and 5, 8, 9 in the rows? There is only one way to do
this. The six matrices are independent. The dimension of symmetric matrix space (3 by 3
matrices) is 6.
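
One natural choice of six matrices can be written down by a short program. This is a sketch under an assumption: the basis built here (a 1 in position (i, j) and in its mirror position (j, i), for each i ≤ j) is a standard choice and may differ from the basis displayed in the printed book. The function names are hypothetical.

```python
def symmetric_basis(n):
    """Basis matrices with a 1 at (i, j) and (j, i), for each pair i <= j."""
    basis = []
    for i in range(n):
        for j in range(i, n):
            E = [[0] * n for _ in range(n)]
            E[i][j] = 1
            E[j][i] = 1
            basis.append(((i, j), E))
    return basis

def combine(coefficients, matrices, n):
    """Linear combination of n by n matrices."""
    total = [[0] * n for _ in range(n)]
    for c, E in zip(coefficients, matrices):
        for i in range(n):
            for j in range(n):
                total[i][j] += c * E[i][j]
    return total

A = [[1, 4, 5], [4, 2, 8], [5, 8, 9]]  # the symmetric matrix from the question above
basis = symmetric_basis(3)
# The coefficient of the (i, j) basis matrix is forced to be the entry A[i][j].
coefficients = [A[i][j] for (i, j), _ in basis]
rebuilt = combine(coefficients, [E for _, E in basis], 3)

print(len(basis))
print(coefficients)
print(rebuilt == A)
```

Each coefficient is forced by one entry on or above the diagonal, which is why there is only one way to produce A.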

To push this further, think about the space of all n by n matrices. One possible basis uses
matrices that have only a single nonzero entry (that entry is 1). There are $n^2$ positions for
that 1, so there are $n^2$ basis matrices:


The dimension of the whole n by n matrix space is $n^2$.

The dimension of the subspace of upper triangular matrices is $\frac{1}{2}n^2 + \frac{1}{2}n$.

The dimension of the subspace of diagonal matrices is n.

The dimension of the subspace of symmetric matrices is $\frac{1}{2}n^2 + \frac{1}{2}n$ (why?).


Function spaces The equations $d^2y/dt^2 = 0$ and $d^2y/dt^2 = -y$ and $d^2y/dt^2 = y$
involve the second derivative. In calculus we solve to find the functions y(t):


$y'' = 0$ is solved by any linear function $y = ct + d$
$y'' = -y$ is solved by any combination $y = c \sin t + d \cos t$
$y'' = y$ is solved by any combination $y = ce^t + de^{-t}$.


That solution space for $y'' = -y$ has two basis functions: sin t and cos t. The space for
$y'' = 0$ has t and 1. It is the “nullspace” of the second derivative! The dimension is 2 in
each case (these are second-order equations). We are finding the null solutions $y_n$.

The solutions of $y'' = 2$ don’t form a subspace, because the right side b = 2 is not zero. A
particular solution is $y = t^2$. The complete solution is $y = y_p + y_n = t^2 + ct + d$.

That complete solution is one particular solution plus any function in the nullspace. A
linear differential equation is like a linear matrix equation Av = b. But we solve it by
calculus instead of linear algebra.
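
The claim about $y'' = 2$ can be checked without symbolic calculus. The sketch below swaps in a different technique, named plainly: the central second difference $(y(t+h) - 2y(t) + y(t-h))/h^2$, which happens to equal the second derivative exactly for polynomials of degree three or less. The function names are hypothetical, and exact fractions avoid roundoff.

```python
from fractions import Fraction

def second_difference(y, t, h=Fraction(1, 10)):
    """Central second difference; exact for polynomials of degree <= 3."""
    return (y(t + h) - 2 * y(t) + y(t - h)) / (h * h)

def complete_solution(c, d):
    """y = y_p + y_n = t^2 + (c t + d) for the equation y'' = 2."""
    return lambda t: t * t + c * t + d

def null_solution(c, d):
    """y_n = c t + d solves y'' = 0."""
    return lambda t: c * t + d

t = Fraction(3, 2)
print(second_difference(complete_solution(4, -7), t))  # 2 for every c and d
print(second_difference(null_solution(4, -7), t))      # 0: y_n is in the nullspace
```

Changing c and d moves the solution around inside the family, but the second difference stays at 2: the particular solution carries the right side, and the null solutions contribute zero.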

We end here with the space Z that contains only the zero vector. The dimension of this 
space is zero. The empty set (containing no vectors) is a basis for Z. We can never allow the 
zero vector into a basis, because then linear independence is lost. 


REVIEW OF THE KEY IDEAS


1. The columns of A are independent if v = 0 is the only solution to Av = 0.


2. The vectors $u_1, \dots, u_n$ span a space if their combinations fill that space. Spanning
vectors can be dependent or independent.


3. A basis consists of linearly independent vectors that span the space. Every vector
in the space is a unique combination of the basis vectors.


4. All bases for a space have the same number of vectors. This number of vectors in a
basis is the dimension of the space.


5. The pivot columns are one basis for the column space. The dimension is the rank r. 


6. The n — r special solutions will be seen as a basis for the nullspace. 


WORKED EXAMPLES


5.4 A Start with the vectors $u_1 = (1, 2, 0)$ and $u_2 = (2, 3, 0)$. (a) Are they linearly
independent? (b) Are they a basis for any space? (c) What space V do they span?
(d) What is the dimension of V? (e) Which matrices A have V as their column space?
(f) Which matrices have V as their nullspace?


Solution
(a) $u_1$ and $u_2$ are independent: the only combination to give 0 is $0u_1 + 0u_2$.
(b) Yes, they are a basis for the space they span.
(c) That space V contains all vectors (x, y, 0). It is the xy plane in $\mathbf{R}^3$.
(d) The dimension of V is 2 since the basis contains two vectors.
(e) This V is the column space of any 3 by n matrix A of rank 2, if row 3 is all zero.
In particular A could just have columns $u_1$ and $u_2$.
(f) This V is the nullspace of any m by 3 matrix B of rank 1, if every row has the form
(0, 0, c). In particular take $B = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}$. Then $Bu_1 = 0$ and $Bu_2 = 0$.


5.4 B (Important example) Suppose $u_1, \dots, u_n$ is a basis for $\mathbf{R}^n$ and the n by n
matrix A is invertible. Show that $Au_1, \dots, Au_n$ is also a basis for $\mathbf{R}^n$.


Solution In matrix language: Put the basis vectors $u_1, \dots, u_n$ in the columns of an
invertible(!) matrix U. Then $Au_1, \dots, Au_n$ are the columns of AU. Since A and U are
invertible, so is AU and its columns give a basis.

In vector language: Suppose $c_1Au_1 + \cdots + c_nAu_n = 0$. This is Av = 0 with
$v = c_1u_1 + \cdots + c_nu_n$. Multiply by $A^{-1}$ to reach v = 0. Linear independence of the u’s
forces all $c_i = 0$. This shows that the Au’s are independent.

To show that the Au’s span $\mathbf{R}^n$, solve $c_1Au_1 + \cdots + c_nAu_n = b$. This is the same as
$c_1u_1 + \cdots + c_nu_n = A^{-1}b$. Since the u’s are a basis, this must be solvable for all b.
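
Problem Set 5.4


Questions 1-10 are about linear independence and linear dependence.


1 Show that $u_1, u_2, u_3$ are independent but $u_1, u_2, u_3, u_4$ are dependent:

$$u_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \quad u_2 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \quad u_3 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \quad u_4 = \begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix}$$

Solve $c_1u_1 + c_2u_2 + c_3u_3 + c_4u_4 = 0$ or Ac = 0. The u’s go in the columns of A.


2 (Recommended) Find the largest possible number of independent vectors among

$$u_1 = \begin{bmatrix} 1 \\ -1 \\ 0 \\ 0 \end{bmatrix} \quad u_2 = \begin{bmatrix} 1 \\ 0 \\ -1 \\ 0 \end{bmatrix} \quad u_3 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ -1 \end{bmatrix} \quad u_4 = \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix} \quad u_5 = \begin{bmatrix} 0 \\ 1 \\ 0 \\ -1 \end{bmatrix} \quad u_6 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ -1 \end{bmatrix}$$


3 Prove that if a = 0 or d = 0 or f = 0 (3 cases), the columns of U are dependent:

$$U = \begin{bmatrix} a & b & c \\ 0 & d & e \\ 0 & 0 & f \end{bmatrix}$$


4 If a, d, f in Question 3 are all nonzero, show that the only solution to Uv = 0 is
v = 0. Then the upper triangular U has independent columns.
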
5 Decide the dependence or independence of 


(a) the vectors (1, 3, 2) and (2, 1,3) and (3, 2, 1) 
(b) the vectors (1, —3, 2) and (2, 1, —3) and (—3, 2, 1). 


6 Choose three independent columns of U and A. Then make two other choices.

$$U = \begin{bmatrix} 2 & 3 & 4 & 1 \\ 0 & 6 & 7 & 0 \\ 0 & 0 & 0 & 9 \\ 0 & 0 & 0 & 0 \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} 2 & 3 & 4 & 1 \\ 0 & 6 & 7 & 0 \\ 0 & 0 & 0 & 9 \\ 4 & 6 & 8 & 2 \end{bmatrix}$$


7 If $w_1, w_2, w_3$ are independent vectors, show that the differences $v_1 = w_2 - w_3$ and
$v_2 = w_1 - w_3$ and $v_3 = w_1 - w_2$ are dependent. Find a combination of the v’s that
gives zero. Which singular matrix gives $\begin{bmatrix} v_1 & v_2 & v_3 \end{bmatrix} = \begin{bmatrix} w_1 & w_2 & w_3 \end{bmatrix} A$?


8 If $w_1, w_2, w_3$ are independent vectors, show that the sums $v_1 = w_2 + w_3$ and
$v_2 = w_1 + w_3$ and $v_3 = w_1 + w_2$ are independent. (Write $c_1v_1 + c_2v_2 + c_3v_3 = 0$
in terms of the w’s. Find and solve equations for the c’s, to show they are zero.)

9 Suppose $u_1, u_2, u_3, u_4$ are vectors in $\mathbf{R}^3$.

(a) These four vectors are dependent because ___
(b) The two vectors $u_1$ and $u_2$ will be dependent if ___
(c) The vectors $u_1$ and (0, 0, 0) are dependent because ___


10 Find two independent vectors on the plane x + 2y - 3z - t = 0 in $\mathbf{R}^4$. Then find three
independent vectors. Why not four? This plane is the nullspace of what matrix?


Questions 11-14 are about the space spanned by a set of vectors. Take all linear com- 
binations of the vectors, to find the space they span. 


11 Describe the subspace of $\mathbf{R}^3$ (is it a line or plane or $\mathbf{R}^3$?) spanned by
(a) the two vectors (1, 1, -1) and (-1, -1, 1)
(b) the three vectors (0, 1, 1) and (1, 1, 0) and (0, 0, 0)
(c) all vectors in $\mathbf{R}^3$ with whole number components
(d) all vectors with positive components.


12 The vector b is in the subspace spanned by the columns of A when ___ has a
solution. The vector c is in the row space of A when ___ has a solution.

True or false: If the zero vector is in the row space, the rows are dependent.


13 Find the dimensions of these 4 spaces. Which two of the spaces are the same?
(a) column space of A (b) column space of U (c) row space of A (d) row space
of U

$$A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 3 & 1 \\ 3 & 1 & -1 \end{bmatrix} \quad \text{and} \quad U = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$


14 v + w and v - w are combinations of v and w. Write v and w as combinations of
v + w and v - w. The two pairs of vectors ___ the same space. When are they a
basis for the same space?


Questions 15-25 are about the requirements for a basis.


15 If $v_1, \dots, v_n$ are linearly independent, the space they span has dimension ___.
These vectors are a ___ for that space. If the vectors are the columns of an m by n
matrix, then m is ___ than n. If m = n, that matrix is ___.


16 Suppose $v_1, v_2, \dots, v_6$ are six vectors in $\mathbf{R}^4$.

(a) Those vectors (do) (do not) (might not) span $\mathbf{R}^4$.
(b) Those vectors (are) (are not) (might be) linearly independent.
(c) Any four of those vectors (are) (are not) (might be) a basis for $\mathbf{R}^4$.

17 Find three different bases for the column space of $U = \begin{bmatrix} 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 \end{bmatrix}$. Then
find two different bases for the row space of U.


18 Find a basis for each of these subspaces of $\mathbf{R}^4$:

(a) All vectors whose components are equal.
(b) All vectors whose components add to zero.
(c) All vectors that are perpendicular to (1, 1, 0, 0) and (1, 0, 1, 1).
(d) The column space and the nullspace of I (4 by 4).


19 The columns of A are n vectors from $\mathbf{R}^m$. If they are linearly independent, what
is the rank of A? If they span $\mathbf{R}^m$, what is the rank? If they are a basis for $\mathbf{R}^m$,
what then? Looking ahead: The rank r counts the number of ___ columns.


20 Find a basis for the plane x - 2y + 3z = 0 in $\mathbf{R}^3$. Find a basis for the intersection
of that plane with the xy plane. Then find a basis for all vectors perpendicular to the
plane.


21 Suppose the columns of a 5 by 5 matrix A are a basis for $\mathbf{R}^5$.

(a) The equation Av = 0 has only the solution v = 0 because ___
(b) If b is in $\mathbf{R}^5$ then Av = b is solvable because the basis vectors ___ $\mathbf{R}^5$.

Conclusion: A is invertible. Its rank is 5. Its rows are also a basis for $\mathbf{R}^5$.


22 Suppose S is a 5-dimensional subspace of $\mathbf{R}^6$. True or false (example if false):

(a) Every basis for S can be extended to a basis for $\mathbf{R}^6$ by adding one more vector.
(b) Every basis for $\mathbf{R}^6$ can be reduced to a basis for S by removing one vector.


23 U comes from A by subtracting row 1 from row 3:

$$A = \begin{bmatrix} 1 & 3 & 2 \\ 0 & 1 & 1 \\ 1 & 3 & 2 \end{bmatrix} \quad \text{and} \quad U = \begin{bmatrix} 1 & 3 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$

Find bases for the two column spaces. Find bases for the two row spaces. Find bases
for the two nullspaces. Which spaces stay fixed in elimination?


24 True or false (give a good reason):

(a) If the columns of a matrix are dependent, so are the rows.
(b) The column space of a 2 by 2 matrix is the same as its row space.
(c) The column space of a 2 by 2 matrix has the same dimension as its row space.
(d) The columns of a matrix are a basis for the column space.


25 For which numbers c and d do these matrices have rank 2?


Questions 26-28 are about spaces where the “vectors” are matrices. 

26 Find a basis (and the dimension) for these subspaces of 3 by 3 matrices:
(a) All diagonal matrices.
(b) All skew-symmetric matrices ($A^T = -A$).


27 Construct six linearly independent 3 by 3 echelon matrices $U_1, \dots, U_6$. What space of
3 by 3 matrices do they span? 


28 Find a basis for the space of all 2 by 3 matrices whose columns add to zero. 
Find a basis for the subspace whose rows also add to zero. 


Questions 29-32 are about spaces where the “vectors” are functions. 
29 (a) Find all functions that satisfy $\dfrac{dy}{dx} = 0$.
(b) Choose a particular function that satisfies $\dfrac{dy}{dx} = 3$.
(c) Find all functions that satisfy $\dfrac{dy}{dx} = 3$.


30 The cosine space $F_3$ contains all combinations $y(x) = A \cos x + B \cos 2x + C \cos 3x$.
Find a basis for the subspace S with y(0) = 0. What is the dimension of S?


31 Find a basis for the space of functions that satisfy

(a) $\dfrac{dy}{dx} - 2y = 0$ (b) $\dfrac{dy}{dx} - \dfrac{y}{x} = 0$.


32 Suppose y1, y2,y3 are three different functions of x. The space they span could 
have dimension 1, 2, or 3. Give an example of y1, y2, y3 to show each possibility. 


33 Find a basis for the space S of vectors (a, b, c, d) with a + c + d = 0 and also for the
space T with a + b = 0 and c = 2d. What is the dimension of the intersection $S \cap T$?


34 Which of the following are bases for $\mathbf{R}^3$?
(a) (1, 2,0) and (0,1, —1) 
(b) (1, 1, -1), (2, 3, 4), (4, 1, -1), (0, 1, -1)
(c) (1, 2,2), (—1, 2,1), (0, 8,0) 
(d) (1,2, 2), (—1, 2,1), (0, 8,6) 


35 Suppose A is 5 by 4 with rank 4. Show that Av = b has no solution when the 5 by 5
matrix [A b] is invertible. Show that Av = b is solvable when [A b] is singular.


36 (a) Find a basis for all solutions to $d^4y/dx^4 = y(x)$.

(b) Find a particular solution to $d^4y/dx^4 = y(x) + 1$. Find the complete solution.

Challenge Problems


37 Write the 3 by 3 identity matrix as a combination of the other five permutation
matrices! Then show that those five matrices are linearly independent. (Assume a
combination gives $c_1P_1 + \cdots + c_5P_5$ = zero matrix, and prove that each $c_i = 0$.)


38 Intersections and sums have $\dim(V) + \dim(W) = \dim(V \cap W) + \dim(V + W)$.
Start with a basis $u_1, \dots, u_r$ for the intersection $V \cap W$. Extend with $v_1, \dots, v_s$ to
a basis for V, and separately with $w_1, \dots, w_t$ to a basis for W. Prove that the u’s,
v’s and w’s together are independent. The dimensions have (r + s) + (r + t) =
(r) + (r + s + t) as desired.


39 Inside $\mathbf{R}^n$, suppose dimension (V) + dimension (W) > n. Why is some nonzero
vector in both V and W? Start with bases $v_1, \dots, v_p$ and $w_1, \dots, w_q$, p + q > n.


40 Suppose A is 10 by 10 and $A^2 = 0$ (zero matrix): A times each column of A is 0.
This means that the column space of A is contained in the ___. If A has rank r,
those subspaces have dimension $r \le 10 - r$. So the rank of A is $r \le 5$, if $A^2 = 0$.

5.5 The Four Fundamental Subspaces


The figure on this page is the big picture of linear algebra. The Four Fundamental
Subspaces are in position: Two orthogonal subspaces in $\mathbf{R}^n$ and two in $\mathbf{R}^m$. For any b
in the column space, the complete solution to Av = b has one particular solution $v_p$ in the
row space, plus any $v_n$ in the nullspace.


Figure 5.5: The Four Fundamental Subspaces. The complete solution $v_p + v_n$ to Av = b.


The main theorem in this chapter connects rank and dimension. The rank of a matrix 
is the number of pivots. The dimension of a subspace is the number of vectors in a basis. 
We count pivots or we count basis vectors. The rank of A reveals the dimensions of 
all four fundamental subspaces. Here are the subspaces, including the new one. 

Two subspaces come directly from A, and the other two come from $A^T$:


Four Fundamental Subspaces                                    Dimensions
1. The row space $C(A^T)$          Subspace of $\mathbf{R}^n$.    r
2. The column space $C(A)$         Subspace of $\mathbf{R}^m$.    r
3. The nullspace $N(A)$            Subspace of $\mathbf{R}^n$.    n - r
4. The left nullspace $N(A^T)$     Subspace of $\mathbf{R}^m$.    m - r    (This is our new space.)


In this book the column space and nullspace came first. We know C(A) and N(A) pretty
well. Now the other two subspaces come forward. The row space contains all combinations
of the rows. This is the column space of $A^T$.

For the left nullspace we solve $A^Ty = 0$. That system is n by m. This is the nullspace
$N(A^T)$. The vectors y go on the left side of A when we transpose to get $y^TA = 0^T$. The
matrices A and $A^T$ are usually different. So are their column spaces and their nullspaces.
But those spaces are connected in an absolutely beautiful way.

Part 1 of the Fundamental Theorem finds the dimensions of the four subspaces. One 
fact stands out: The row space and column space have the same dimension r. This is 
the rank of the matrix. The other important fact involves the two nullspaces: 


N(A) and $N(A^T)$ have dimensions n - r and m - r, to make up the full n and m.


Part 2 of the Fundamental Theorem will describe how the four subspaces fit together
(two in $\mathbf{R}^n$ and two in $\mathbf{R}^m$). That completes the “right way” to understand every Av = b.
Stay with it—you are doing real mathematics. 


The Four Subspaces for R 


Suppose A is reduced to its row echelon form R. For that special form, the four subspaces 
are easy to identify. We will find a basis for each subspace and check its dimension. Then 
we watch how the subspaces change (two of them don’t change) as we look back at A. 
The main point will be that the four dimensions are the same for A and R. 

As a specific 3 by 5 example, look at the four subspaces for this echelon matrix R: 


$$R = \begin{bmatrix} 1 & 3 & 5 & 0 & 7 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \qquad \begin{array}{l} m = 3,\ n = 5,\ r = 2 \\ \text{pivot rows 1 and 2} \\ \text{pivot columns 1 and 4} \end{array}$$


The rank of this matrix R is r = 2 (two pivots). Take the four subspaces in order.


1. The row space of R has dimension 2, matching the rank.


Reason: The first two rows are a basis. The row space contains combinations of all three
rows, but the third row (the zero row) adds nothing new. So rows 1 and 2 span the row space
$C(R^T)$.

The pivot rows 1 and 2 are independent. That is obvious for this example, and it is always
true. If we look only at the pivot columns, we see the r by r identity matrix.
There is no way to combine its rows to give the zero row (except by the combination with all
coefficients zero). So the r pivot rows are a basis for the row space.


The dimension of the row space is the rank r. The nonzero rows of R form a basis.


2. The column space of R also has dimension r = 2, matching the rank. 


Reason: The pivot columns 1 and 4 form a basis for C(R). They are independent because 
they start with the r by r identity matrix. No combination of those pivot columns can give 
the zero column (except the combination with all coefficients zero). And they also span
the column space. Every other (free) column is a combination of the pivot columns.


The combinations we need are revealed by the three special solutions:


Column 2 is 3 times column 1. The special solution is (-3, 1, 0, 0, 0).
Column 3 is 5 times column 1. The special solution is (-5, 0, 1, 0, 0).
Column 5 is 7 (column 1) + 2 (column 4). That solution is (-7, 0, 0, -2, 1).

The pivot columns are independent, and they span C(R), so they are a basis for C(R).

The dimension of the column space is the rank r. The pivot columns form a basis.


3. The nullspace has dimension n - r = 5 - 2. There are n - r = 3 free variables.


$v_2, v_3, v_5$ are free (no pivots in those columns). They yield the three special solutions $s_2$,
$s_3$, $s_5$ to Rv = 0. Set a free variable to 1, and solve for the pivot variables $v_1$ and $v_4$.

$$s_2 = \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} \quad s_3 = \begin{bmatrix} -5 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} \quad s_5 = \begin{bmatrix} -7 \\ 0 \\ 0 \\ -2 \\ 1 \end{bmatrix} \qquad \begin{array}{l} Rv = 0 \text{ has the complete solution} \\ v = v_2s_2 + v_3s_3 + v_5s_5 \end{array}$$

There is a special solution for each free variable. With n variables and r pivot variables, that
leaves n - r free variables and special solutions. N(R) has dimension n - r.


The nullspace has dimension n - r. The special solutions form a basis.


The special solutions are independent, because they contain the identity matrix in
rows 2, 3, 5. All solutions are combinations of special solutions, $v = v_2s_2 + v_3s_3 + v_5s_5$,
because this puts $v_2$, $v_3$ and $v_5$ in the correct positions. Then the pivot variables $v_1$
and $v_4$ are totally determined by the equations Rv = 0.


4. The nullspace of $R^T$ (the left nullspace of R) has dimension m - r = 3 - 2.


Reason: The equation $R^Ty = 0$ looks for combinations of the columns of $R^T$ (the rows of
R) that produce zero. You see why $y_1$ and $y_2$ must be zero, and $y_3$ is free.

$$\begin{array}{rl} & y_1\,[1,\ 3,\ 5,\ 0,\ 7] \\ + & y_2\,[0,\ 0,\ 0,\ 1,\ 2] \\ + & y_3\,[0,\ 0,\ 0,\ 0,\ 0] \\ \hline & [0,\ 0,\ 0,\ 0,\ 0] \end{array} \qquad \text{Left nullspace:}\ \ [0\ \ 0\ \ y_3]\,R = [0,\ 0,\ 0,\ 0,\ 0] \tag{1}$$

Figure 5.6: Bases and dimensions of the Four Fundamental Subspaces. (The big picture: the row space $C(A^T)$, all $A^Ty$, has dimension r with the pivot rows as a basis. The column space C(A), all Av, has dimension r with the pivot columns as a basis. The nullspace N(A), Av = 0, has dimension n - r with the special solutions as a basis. The left nullspace $N(A^T)$, $A^Ty = 0$, has dimension m - r with the last rows of E as a basis, where EA = R.)


In all cases R ends with m - r zero rows. Every combination of these m - r rows
gives zero. These are the only combinations of the rows of R that give zero, because the
r pivot rows are linearly independent. The left nullspace of R contains all these solutions
$y = (0, \dots, 0, y_{r+1}, \dots, y_m)$ to $R^Ty = 0$.


If A is m by n of rank r, its left nullspace has dimension m - r.


This subspace came fourth, and it completes the picture of linear algebra.


In $\mathbf{R}^n$ the row space and nullspace have dimensions r and n - r (adding to n).
In $\mathbf{R}^m$ the column space and left nullspace have dimensions r and m - r (total m).


So far this is proved for echelon matrices R. Figure 5.6 shows the same for A. 


The Four Subspaces for A 


We have a job still to do. The subspace dimensions for A are the same as for R. 
The job is to explain why. A is now any matrix that reduces to R = rref(A). 


$$\text{This } A \text{ reduces to } R \qquad A = \begin{bmatrix} 1 & 3 & 5 & 0 & 7 \\ 0 & 0 & 0 & 1 & 2 \\ 1 & 3 & 5 & 1 & 9 \end{bmatrix} \qquad \text{Notice } C(A) \neq C(R) \tag{2}$$

An elimination matrix takes A to R. The big picture (Figure 5.6) applies to both.


The invertible matrix E is the product of the elementary matrices that reduce A to R:

$$A \text{ to } R \text{ and back} \qquad EA = R \quad \text{and} \quad A = E^{-1}R \tag{3}$$

1 A has the same row space as R. Same dimension r and same basis.


Reason: Every row of A is a combination of the rows of R. Also every row of R is a 
combination of the rows of A. Elimination changes rows, but not row spaces. 

Since A has the same row space as R, we can choose the first r rows of R as a basis. 
The first r rows of A could be dependent. The good r rows of A end up as pivot rows. 


2 The column space of A has dimension r. The r pivot columns of A are a basis. 
The number of independent columns equals the number of independent rows. 


Wrong reason: “A and R have the same column space.” This is false. The columns of R 
often end in zeros. The columns of A don’t often end in zeros. The column spaces can be 
different ! But their dimensions are the same—both equal to r. 


Right reason: The same combinations of the columns are zero (or nonzero) for A and R. 
Say that another way: Av = 0 exactly when Rv = 0. Pivot columns are independent. 


We have just given one proof of the first great theorem of linear algebra: Row rank equals 
column rank. This was easy for R, and the ranks are the same for A. The Chapter 5 Notes 
propose three direct proofs not using R. 


3 A has the same nullspace as R. Same dimension n — r and same basis. 


Reason: The elimination steps don’t change the solutions. The special solutions are a 
basis for this nullspace (as we always knew). There are n — r free variables, so the 
dimension of the nullspace is n — r. Notice that r + (n — r) equals n: 


That beautiful fact is the Counting Theorem. Now apply it also to $A^T$.


4 The left nullspace of A (the nullspace of $A^T$) has dimension m - r.


Reason: $A^T$ is just as good a matrix as A. When we know the dimensions for every A,
we also know them for $A^T$. Its column space was proved to have dimension r. Since $A^T$
is n by m, the “whole space” is now $\mathbf{R}^m$. The counting rule for A was r + (n - r) = n.
The counting rule for $A^T$ is r + (m - r) = m. We have all details of the main theorem:


Fundamental Theorem of Linear Algebra, Part 1 


The column space and row space both have dimension r. 


The nullspaces have dimensions n — r and m — r. 
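
Part 1 can be checked numerically on one matrix. The sketch below is an illustration only: it takes A to be the 3 by 5 matrix with rows (1, 3, 5, 0, 7), (0, 0, 0, 1, 2), (1, 3, 5, 1, 9), which is an assumption about the example in this section, and the helper names `rref` and `transpose` are hypothetical. Exact fractions are used throughout.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivot_columns) for the reduced row echelon form."""
    M = [[Fraction(x) for x in row] for row in rows]
    m, n = len(M), len(M[0])
    pivots, r = [], 0
    for c in range(n):
        p = next((i for i in range(r, m) if M[i][c] != 0), None)
        if p is None:
            continue  # free column
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [x / piv for x in M[r]]
        for i in range(m):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    return M, pivots

def transpose(rows):
    return [list(col) for col in zip(*rows)]

A = [[1, 3, 5, 0, 7], [0, 0, 0, 1, 2], [1, 3, 5, 1, 9]]  # assumed example matrix
m, n = len(A), len(A[0])

R, pivots = rref(A)
r = len(pivots)                              # rank of A
r_transpose = len(rref(transpose(A))[1])     # rank of A^T, computed independently

dimensions = {
    "row space": r_transpose,
    "column space": r,
    "nullspace": n - r,
    "left nullspace": m - r_transpose,
}
print(dimensions)

# Column 1 of R is (1, 0, 0). Appending it to A raises the rank,
# so it is not in C(A): elimination changed the column space.
augmented = [row + [e] for row, e in zip(A, [1, 0, 0])]
print(len(rref(augmented)[1]) > r)
```

The two ranks are found by separate eliminations on A and on $A^T$, so their agreement is a genuine check of row rank = column rank for this matrix.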


By concentrating on spaces of vectors, not on individual numbers or vectors, we get these 
clean rules. You will soon take them for granted. But for an 11 by 17 matrix with 187 
nonzero entries, I don’t think most people would see why these facts are true: 


dimension of C(A) = dimension of $C(A^T)$ = rank of A


dimension of C(A) + dimension of N(A) = 17.

Example 1 $A = \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}$ has m = 1 and n = 3 and rank r = 1.


The row space is a line in $\mathbf{R}^3$. The nullspace is the plane Av = x + 2y + 3z = 0.
This plane has dimension 2 (which is 3 - 1). The dimensions add to 1 + 2 = 3.

The columns of this 1 by 3 matrix are in $\mathbf{R}^1$. The column space is all of $\mathbf{R}^1$. The left
nullspace contains only the zero vector. The only solution to $A^Ty = 0$ is y = 0, no other
multiple of [1 2 3] gives the zero row. Thus $N(A^T)$ is Z, the zero space with dimension
0 (which is m - r). In $\mathbf{R}^m$ the dimensions add to 1 + 0 = 1.


Example 2 $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \end{bmatrix}$ has m = 2 and n = 3 and rank r = 1.


The row space is the same line through (1, 2, 3). The nullspace must be the same plane
x + 2y + 3z = 0. The dimensions of those two spaces still add to n: 1 + 2 = 3.

All columns are multiples of the first column (1, 2). Twice the first row minus the
second row is the zero row. Therefore $A^Ty = 0$ has the solution y = (2, -1). The column
space and left nullspace are perpendicular lines in $\mathbf{R}^2$. Dimensions add to m: 1 + 1 = 2.


Column space = line through $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$. Left nullspace = line through $\begin{bmatrix} 2 \\ -1 \end{bmatrix}$.


If A has three equal rows, its rank is ___. What are two of the y’s in its left nullspace?


The y’s in the left nullspace combine with the rows to give the zero row.


Matrices of Rank One 


Those examples had rank r = 1, and rank one matrices are special. We can describe them
all. You will see again that dimension of row space = dimension of column space. When
r = 1, every row is a multiple of the same row $r^T$:

$$A = cr^T: \qquad \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ -3 & -6 & -9 \\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ -3 \\ 0 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}$$

A column times a row (4 by 1 times 1 by 3) produces a matrix (4 by 3). All rows are
multiples of the row $r^T = (1, 2, 3)$. All columns are multiples of the first column
c = (1, 2, -3, 0). The row space is a line in $\mathbf{R}^n$, and the column space is a line in $\mathbf{R}^m$.


Every rank one matrix has the special form $A = cr^T$ = column times row.


All columns are multiples of c. All rows are multiples of $r^T$. The nullspace is the
plane perpendicular to r. (Av = 0 means that $c(r^Tv) = 0$ and then $r^Tv = 0$.) This
perpendicularity of the subspaces will become Part 2 of the Fundamental Theorem.


A column vector c times a row vector $r^T$ is often called an outer product.
The inner product $r^Tc$ is a number, the outer product $cr^T$ is a matrix.
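
A small sketch makes the column-times-row picture concrete. It uses the vectors c = (1, 2, -3, 0) and r = (1, 2, 3) from this section; the function names and the test vector v are hypothetical choices for illustration.

```python
def outer(c, r):
    """Outer product c r^T: an m by n matrix, each row a multiple of r."""
    return [[ci * rj for rj in r] for ci in c]

def inner(x, y):
    """Inner product x^T y: a single number."""
    return sum(a * b for a, b in zip(x, y))

def has_rank_at_most_one(A):
    """Every 2 by 2 minor of a rank one (or zero) matrix vanishes."""
    m, n = len(A), len(A[0])
    return all(A[i][j] * A[k][l] == A[i][l] * A[k][j]
               for i in range(m) for k in range(i + 1, m)
               for j in range(n) for l in range(j + 1, n))

c = (1, 2, -3, 0)
r = (1, 2, 3)
A = outer(c, r)
print(A)

# Av = c (r^T v): the product is zero exactly when v is perpendicular to r.
v = (3, 0, -1)                              # hypothetical vector with r^T v = 0
Av = [inner(row, v) for row in A]
print(inner(r, v), Av)
```

Since every entry of Av is a component of c times the single number $r^Tv$, the nullspace is the plane $r^Tv = 0$.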
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Perpendicular Subspaces 


Look at the equation Av = 0. This says that v is in the nullspace of A. It also says that 
v is perpendicular to every row of A. The first row multiplies v to give the first zero 
in Av =0: 


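$$Av = \begin{bmatrix} \text{row } 1 \\ \vdots \\ \text{row } m \end{bmatrix} \begin{bmatrix} v \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \qquad \begin{bmatrix} 1 & 1 & 1 \\ 3 & 1 & 0 \\ 0 & 2 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$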


The vector v = (1, -3, 2) in the nullspace is perpendicular to the first row (1, 1, 1). Their
dot product is 1 - 3 + 2 = 0. That vector v is also perpendicular to the rows (3, 1, 0) and
(0, 2, 3), because of the zeros on the right hand side. The dot product of every row and
every v is zero.

Every v in the nullspace is perpendicular to the whole row space. It is perpendicular 
to each row and it is perpendicular to all combinations of rows. We have found new words 
to describe the nullspace of A: 


N(A) contains all vectors v that are perpendicular to the row space of A.


These two fundamental subspaces N(A) and $R(A^T)$ now have a position in space. They
are “orthogonal subspaces” like the xy plane and the z axis in $\mathbf{R}^3$. Tilt that picture and
you still have orthogonal subspaces. Their dimensions 2 and 1 still add to 3: the dimension
of the whole space. For any matrix, the r-dimensional row space is perpendicular to the
(n - r)-dimensional nullspace. If that matrix is $A^T$ instead of A, we have subspaces of $\mathbf{R}^m$.


(In $\mathbf{R}^n$) All solutions to Av = 0 are perpendicular to all rows of A.
(In $\mathbf{R}^m$) All solutions to $A^T y = 0$ are perpendicular to all columns of A.
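Both statements can be tested numerically on the 3 by 3 example above. The sketch below is plain Python; the left nullspace vector y = (3, -1, -1) does not appear in the text and is an assumption found by hand (three times row 1 minus row 2 minus row 3 is the zero row).

```python
def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

A = [[1, 1, 1],
     [3, 1, 0],
     [0, 2, 3]]
v = [1, -3, 2]       # nullspace vector from the text: Av = 0
y = [3, -1, -1]      # assumed here: 3(row 1) - (row 2) - (row 3) = zero row

columns = [list(col) for col in zip(*A)]

row_dots = [dot(row, v) for row in A]         # v against every row of A
col_dots = [dot(col, y) for col in columns]   # y against every column of A

# An arbitrary combination of the rows stays perpendicular to v.
combo = [2 * a - 5 * b + 7 * c for a, b, c in zip(*A)]

print(row_dots, col_dots, dot(combo, v))
```

The last check is the point of the section: v is perpendicular to the whole row space, not only to the three rows that happen to be written in A.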


If A is square and invertible, the two nullspaces are just Z: only the zero vector. The row 
and column spaces are the whole space. These are the extreme in perpendicular subspaces : 
everything and nothing. No, not nothing, the zero vector is perpendicular to everything. 

Let me draw the big picture using this new insight of perpendicular subspaces. 
This perpendicularity is Part 2 of the Fundamental Theorem of Linear Algebra. We use
a new symbol $S^\perp$ (called S perp) for all vectors that are orthogonal to the subspace S.


Fundamental Theorem, Part 2:  $N(A) = C(A^T)^\perp$ and $N(A^T) = C(A)^\perp$.

We know we have all perpendicular vectors (not just some of them, like 2 lines in space).
The dimensions r and n - r add to the full dimension n. For a line and plane in $\mathbf{R}^3$:


$(\text{Line in space})^\perp = (\text{Plane in space})$ and 1 + 2 = 3.

Here is Problem 37 in the problem set: Explain why $(S^\perp)^\perp = S$.


REVIEW OF THE KEY IDEAS


1. The r pivot rows of R are a basis for the row spaces of R and A (same space).

2. The r pivot columns of A (not R) are a basis for its column space C(A).

3. The n - r special solutions are a basis for the nullspaces of A and R (same space).

4. The last m - r rows of I are a basis for the left nullspace of R.

5. The last m - r rows of E are a basis for the left nullspace of A, if EA = R.

6. $R(A^T)$ is perpendicular to N(A). And C(A) is perpendicular to $N(A^T)$.


WORKED EXAMPLES


5.5 A  Find bases and dimensions for all four fundamental subspaces if you know that

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 5 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 3 & 0 & 5 \\ 0 & 0 & 1 & 6 \\ 0 & 0 & 0 & 0 \end{bmatrix} = E^{-1} R.$$

By changing only one number in R, change the dimensions of all four subspaces.


Solution  This matrix has pivots in columns 1 and 3. Its rank is r = 2.


Row space           Basis (1, 3, 0, 5) and (0, 0, 1, 6) from R. Dimension 2.
Column space        Basis (1, 2, 5) and (0, 1, 0) from $E^{-1}$ (and A). Dimension 2.
Nullspace           Basis (-3, 1, 0, 0) and (-5, 0, -6, 1) from R. Dimension 2.
Nullspace of $A^T$    Basis (-5, 0, 1) from row 3 of E. Dimension 3 - 2 = 1.
We need to comment on that left nullspace $N(A^T)$. EA = R says that the last row of E
combines the three rows of A into the zero row of R. So that last row of E is a basis vector
for the left nullspace. If R had two zero rows, then the last two rows of E would be a basis.
(Just like elimination, $y^T A = 0^T$ combines rows of A to give zero rows in R.)

To change all these dimensions we need to change the rank r. The way to do that is to
change the zero row of R. The best entry to change is $R_{34}$ in the corner.
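Worked Example 5.5 A can be replayed in code. The sketch below is plain Python and rests on one assumption: the factor $E^{-1}$ is taken to have columns (1, 2, 5), (0, 1, 0), (0, 0, 1), which is consistent with the column space basis and with row 3 of E = (-5, 0, 1) quoted in the solution. The helpers `matmul`, `rank` and `dims` are illustrative names.

```python
from fractions import Fraction

def matmul(X, Y):
    """Matrix product X times Y for lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def rank(rows):
    """Rank by forward elimination with exact fractions."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def dims(A):
    """Dimensions of (row space, column space, nullspace, left nullspace)."""
    m, n, r = len(A), len(A[0]), rank(A)
    return (r, r, n - r, m - r)

E_inv = [[1, 0, 0], [2, 1, 0], [5, 0, 1]]          # assumed form of E^{-1}
R = [[1, 3, 0, 5], [0, 0, 1, 6], [0, 0, 0, 0]]
A = matmul(E_inv, R)

# The bases quoted in the solution really do give zero.
specials = [[-3, 1, 0, 0], [-5, 0, -6, 1]]
Av = [[sum(a * s for a, s in zip(row, sp)) for row in A] for sp in specials]
y = [-5, 0, 1]
yA = [sum(y[i] * A[i][j] for i in range(3)) for j in range(4)]

# Change R_34 from 0 to 1: the rank jumps to 3 and all four dimensions change.
R2 = [row[:] for row in R]
R2[2][3] = 1
A2 = matmul(E_inv, R2)

print(dims(A), dims(A2))
```

With the corner entry changed, the dimensions go from (2, 2, 2, 1) to (3, 3, 1, 0), so one number in R changes all four subspaces.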


5.5 B_ How can you put four 1’s into a 5 by 6 matrix of zeros, so that its row space 
has dimension 1? Describe all the ways to make its column space have dimension 1. 
Describe all the ways to make the dimension of its nullspace N(A) as small as possible. 
How would you make the sum of the dimensions of all four subspaces small ? 


Solution  The rank is 1 if the four 1’s go into the same row, or into the same column. They
can also go into two rows and two columns (so $a_{ii} = a_{ij} = a_{ji} = a_{jj} = 1$).
Since the column space and row space always have the same dimension, this answers the
first two questions: The smallest dimension is 1.

The nullspace has its smallest possible dimension 6 — 4 = 2 when the rank is r = 4. 
To achieve rank 4, the 1’s must go into four different rows and columns. 

You can’t do anything about the sum r + (n - r) + r + (m - r) = n + m. It will be
6 + 5 = 11 no matter how the 1’s are placed. The sum is 11 even if there aren’t any 1’s...


If all the other entries of A are 2’s instead of 0’s, how do these answers change ? 


Problem Set 5.5 


1 (a) If a 7 by 9 matrix has rank 5, what are the dimensions of the four subspaces ? 
What is the sum of all four dimensions? 


(b) If a 3 by 4 matrix has rank 3, what are its column space and left nullspace?


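2 Find bases and dimensions for the four subspaces associated with A and B:

$$A = \begin{bmatrix} 1 & 2 & 4 \\ 2 & 4 & 8 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & 2 & 4 \\ 2 & 5 & 8 \end{bmatrix}.$$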
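3 Find a basis for each of the four subspaces associated with A:

$$A = \begin{bmatrix} 0 & 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 4 & 6 \\ 0 & 0 & 0 & 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 & 2 & 3 & 4 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$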


4 Construct a matrix with the required property or explain why this is impossible: 
: 1] Jo : 
(a) Column space contains 1 : [8]. row space contains | 4], [2]. 


(b) Column space has basis [| , nullspace has basis H : 
(c) Dimension of nullspace = 1 + dimension of left nullspace. 
(d) Left nullspace contains | 3], row space contains | $]. 


(e) Row space = column space, nullspace ≠ left nullspace.


5 If V is the subspace spanned by (1,1,1) and (2,1,0), find a matrix A that has
V as its row space. Find a matrix B that has V as its nullspace. 


6 Without elimination, find dimensions and bases for the four subspaces for


7 Suppose the 3 by 3 matrix A is invertible. Write down bases for the four subspaces for
A, and also for the 3 by 6 matrix B = [A A].


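8 What are the dimensions of the four subspaces for A, B, and C, if I is the 3 by 3
identity matrix and 0 is the 3 by 2 zero matrix?

$$A = \begin{bmatrix} I & 0 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} I & I \\ 0^T & 0^T \end{bmatrix} \quad \text{and} \quad C = \begin{bmatrix} 0 \end{bmatrix}.$$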


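9 Which subspaces are the same for these matrices of different sizes?

$$\text{(a) } \begin{bmatrix} A \end{bmatrix} \text{ and } \begin{bmatrix} A \\ A \end{bmatrix} \qquad \text{(b) } \begin{bmatrix} A \\ A \end{bmatrix} \text{ and } \begin{bmatrix} A & A \\ A & A \end{bmatrix}$$

Prove that all three of those matrices have the same rank r.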


10 If the entries of a 3 by 3 matrix are chosen randomly between 0 and 1, what are the
most likely dimensions of the four subspaces ? What if the matrix is 3 by 5? 


11 (Important) A is an m by n matrix of rank r. Suppose there are right sides b for which
Av = b has no solution.


(a) What are all inequalities (< or ≤) that must be true between m, n, and r?


(b) How do you know that $A^T y = 0$ has solutions other than y = 0?


12 Construct a matrix with (1, 0, 1) and (1, 2, 0) as a basis for its row space and its column
space. Why can’t this be a basis for the row space and nullspace? 


13 True or false (with a reason or a counterexample):


(a) If m = n then the row space of A equals the column space. 
(b) The matrices A and —A share the same four subspaces. 


(c) If A and B share the same four subspaces then A is a multiple of B. 
14 Without computing A, find bases for its four fundamental subspaces:

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 6 & 1 & 0 \\ 9 & 8 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 1 & 2 \end{bmatrix}.$$


15 If you exchange the first two rows of A, which of the four subspaces stay the same?
If v = (1, 2,3, 4) is in the left nullspace of A, write down a vector in the left nullspace 
of the new matrix. 


16 Explain why v = (1, 0, -1) cannot be a row of A and also in the nullspace.


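17 Describe the four subspaces of $\mathbf{R}^3$ associated with

$$A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \quad \text{and} \quad I + A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}.$$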


18 (Left nullspace) Add the extra column b and reduce A to echelon form:

$$\begin{bmatrix} A & b \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 & b_1 \\ 4 & 5 & 6 & b_2 \\ 7 & 8 & 9 & b_3 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 2 & 3 & b_1 \\ 0 & -3 & -6 & b_2 - 4b_1 \\ 0 & 0 & 0 & b_3 - 2b_2 + b_1 \end{bmatrix}$$

A combination of the rows of A has produced the zero row. What combination is it?
(Look at $b_3 - 2b_2 + b_1$ on the right side.) Which vectors are in the nullspace of $A^T$
and which vectors are in the nullspace of A?


19 Following the method of Problem 18, reduce A to echelon form and look at the zero
rows. The b column tells which combinations you have taken of the rows: 


12 db ; ; a 

(a) | 3 4 be (b) 2 
4 6 b: Pee 

~ "8 2» Be wba 


From the b column after elimination, read off m — r basis vectors in the left nullspace. 
Those y’s are combinations of rows that give zero rows. 


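20 (a) Find the solutions to Av = 0. Check that the v’s are perpendicular to the rows:

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 4 & 1 \end{bmatrix} \begin{bmatrix} 4 & 2 & 0 & 1 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix} = ER$$

(b) How many independent solutions to $A^T y = 0$? Why is $y^T$ the last row of $E^{-1}$?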
21 Suppose A is the sum of two matrices of rank one: $A = uv^T + wz^T$.


(a) Which vectors span the column space of A?

(b) Which vectors span the row space of A?

(c) The rank is less than 2 if ____ or if ____.

(d) Compute A and its rank if u = z = (1,0,0) and v = w = (0,0, 1). 


22 Construct $A = uv^T + wz^T$ whose column space has basis (1, 2, 4), (2, 2, 1) and
whose row space has basis (1, 0), (1, 1). Write A as (3 by 2) times (2 by 2).


23 Without multiplying matrices, find bases for the row and column spaces of A:


ay 
ee eee 
Qh 


How do you know from these shapes that A = (3 by 2) (2 by 3) cannot be invertible ? 


24 (Important) $A^T y = d$ is solvable when d is in which of the four subspaces? The
solution y is unique when the ____ contains only the zero vector.


25 True or false (with a reason or a counterexample):


(a) A and $A^T$ have the same number of pivots.
(b) A and $A^T$ have the same left nullspace.
(c) If the row space equals the column space then $A^T = A$.
(d) If $A^T = -A$ then the row space of A equals the column space of A.


26 (Rank of AB ≤ ranks of A and B) If AB = C, the rows of C are combinations
of the rows of ____. So the rank of C is not greater than the rank of ____. Since
$B^T A^T = C^T$, the rank of C is also not greater than the rank of ____.


27 If a, b, c are given with a ≠ 0, how would you choose d so that $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ has rank 1?
Find a basis for the row space and nullspace. Show they are perpendicular!


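28 Find the ranks of the 8 by 8 checkerboard matrix B and the chess matrix C:

$$B = \begin{bmatrix} 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\ 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{bmatrix} \quad \text{and} \quad C = \begin{bmatrix} r & n & b & q & k & b & n & r \\ p & p & p & p & p & p & p & p \\ \multicolumn{8}{c}{\text{four zero rows}} \\ p & p & p & p & p & p & p & p \\ r & n & b & q & k & b & n & r \end{bmatrix}$$

The numbers r, n, b, q, k, p are all different. Find bases for the row space and the
left nullspace of B and C. Challenge problem: Find a basis for the nullspace of C.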
29 Can tic-tac-toe be completed (5 ones and 4 zeros in A) so that rank (A) = 2 but neither
side passed up a winning move?


Problems 30-33 are about perpendicularity of the fundamental subspaces (two perpendicular
pairs).


30 The floor and a wall of your room are not perpendicular subspaces in $\mathbf{R}^3$. Why not?
I am extending the floor and wall to be planes in $\mathbf{R}^3$.


31 Explain why every y in $N(A^T)$ is perpendicular to every column of A.


32 Suppose P is the plane of vectors in $\mathbf{R}^4$ satisfying $v_1 + v_2 + v_3 + v_4 = 0$. Find a basis
for $P^\perp$. Find a matrix A with N(A) = P.


33 Why can’t A have (1, 4, 5) in its row space and (4, 5, 1) in its nullspace?


Challenge Problems


34 If $A = uv^T$ is a 2 by 2 matrix of rank 1, redraw Figure 5.6 to show clearly the
Four Fundamental Subspaces in terms of u and v. If another matrix B produces those
same four subspaces, what is the exact relation of B to A?


35 M is the 9-dimensional space of 3 by 3 matrices. Multiply every matrix X by A:

$$A = \begin{bmatrix} 1 & 0 & -1 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}. \quad \text{Notice: } A \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$

(a) Which matrices X lead to AX = zero matrix?
(b) Which matrices have the form AX for some matrix X?

(a) finds the “nullspace” of that operation AX and (b) finds the “column space”. What
are the dimensions of those two subspaces of M? Why do the dimensions add to
(n - r) + r = 9?


36 Suppose the m by n matrices A and B lead to the same four subspaces. If both
matrices are already in row reduced echelon form, prove that F must equal G:

$$A = \begin{bmatrix} I & F \\ 0 & 0 \end{bmatrix} \qquad B = \begin{bmatrix} I & G \\ 0 & 0 \end{bmatrix}$$


37 For any subspace S of $\mathbf{R}^n$, why is $(S^\perp)^\perp = S$? “If $S^\perp$ contains all vectors perpendicular
to S, then S contains all vectors perpendicular to $S^\perp$.” Dimensions add to n.


38 If $A^T A v = 0$ then Av = 0. Reason: This Av is in the nullspace of $A^T$. Every Av is
in the column space of A (why?). Those spaces are perpendicular, and only Av = 0
can be perpendicular to itself. So $A^T A$ has the same nullspace as A.
5.6 Graphs and Networks


Over the years I have seen one model so often, and I found it so basic and useful, that I 
always put it first. The model consists of nodes connected by edges. This is called a graph. 

Graphs of the usual kind display functions f(x). Graphs of this node-edge kind lead to 
matrices. This section is about the incidence matrix of a graph—which tells how the n nodes 
are connected by the m edges. Normally m > n, there are more edges than nodes. 


Every entry of an incidence matrix is 0 or 1 or -1. This continues to hold during elimination.
All pivots and multipliers are ±1. Then the echelon matrix R after elimination
also contains 0, 1, -1. So do the special solutions! All four subspaces have basis vectors
with these exceptionally simple components. The matrices are not concocted for a textbook,
they come from a model that is absolutely essential in pure and applied mathematics.


For these incidence matrices, the four fundamental subspaces have meaning and impor- 
tance. Up to now, I have created small matrix examples to show the column space and 
nullspace. I was claiming that all four subspaces need to be understood, but you wouldn’t 
know their importance from such small examples. Now comes the chance to learn about the 
most valuable models in discrete mathematics—graphs and their matrices. 


Graphs and Incidence Matrices 


Figure 5.7 displays a graph with m = 6 edges and n = 4 nodes. Its incidence matrix 
will be 6 by 4. This matrix A tells which nodes are connected by which edges. The 
entries -1 and +1 also tell the direction of each arrow. The first row -1, 1, 0, 0 of A
(the incidence matrix) shows that the first edge goes from node 1 to node 2.


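$$A = \begin{bmatrix} -1 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & -1 & 1 & 0 \\ -1 & 0 & 0 & 1 \\ 0 & -1 & 0 & 1 \\ 0 & 0 & -1 & 1 \end{bmatrix} \begin{matrix} \text{edge } 1 \\ 2 \\ 3 \\ 4 \\ 5 \\ 6 \end{matrix} \qquad \text{(columns are nodes 1, 2, 3, 4)}$$


Figure 5.7: Complete graph with m = 6 edges and n = 4 nodes. Edge 1 gives row 1.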


Row numbers in A are edge numbers on the graph. Column numbers are node numbers. 
This particular graph is complete—every pair of nodes is connected by an edge. You can 
write down A immediately by looking at the graph. The graph and the matrix have the same 
information. 

If edge 6 is removed from the graph, row 6 is removed from the matrix. The constant 
vector (1,1,1,1) is still in the nullspace of A. Our goal is to understand all four of the 
fundamental subspaces coming from A. 
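Writing down A from the graph is mechanical enough to put into code. The sketch below is plain Python; the function name `incidence` and the sample voltages are assumptions for illustration, and the edge list is read off from the rows of the 6 by 4 matrix (edge 1 goes from node 1 to node 2, and so on).

```python
def incidence(edges, n):
    """Incidence matrix: row for edge (start, end) has -1 at start and +1 at end."""
    A = []
    for start, end in edges:
        row = [0] * n
        row[start - 1] = -1
        row[end - 1] = 1
        A.append(row)
    return A

def times(A, v):
    """Matrix times vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Edges 1..6 of the complete graph on nodes 1..4.
edges = [(1, 2), (1, 3), (2, 3), (1, 4), (2, 4), (3, 4)]
A = incidence(edges, 4)

ones = [1, 1, 1, 1]
print(A[0])                      # first row: edge 1 from node 1 to node 2
print(times(A, ones))            # constant vector is in the nullspace
print(times(A[:-1], ones))       # still true after removing edge 6 (row 6)
print(times(A, [5, 7, 7, 10]))   # sample voltages give differences across edges
```

Removing an edge deletes a row and cannot remove (1, 1, 1, 1) from the nullspace, because every remaining row still contains one -1 and one +1.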
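The Nullspace and Row Space


For the nullspace we solve Av = 0. By writing down those m equations we see that
A is a difference matrix:

$$Av = \begin{bmatrix} -1 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & -1 & 1 & 0 \\ -1 & 0 & 0 & 1 \\ 0 & -1 & 0 & 1 \\ 0 & 0 & -1 & 1 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{bmatrix} = \begin{bmatrix} v_2 - v_1 \\ v_3 - v_1 \\ v_3 - v_2 \\ v_4 - v_1 \\ v_4 - v_2 \\ v_4 - v_3 \end{bmatrix} \qquad (1)$$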


The numbers $v_1, v_2, v_3, v_4$ can represent voltages at the nodes. Then Av gives the voltage
differences across the six edges. It is these differences that make currents flow.

The nullspace contains the solutions to Av = 0. All six voltage differences are zero.
This means: All four voltages are equal. Every v in the nullspace is a constant vector
v = (c, c, c, c). The nullspace of A is a line in $\mathbf{R}^n$. Its dimension is n - r = 1, so r = 3.


We can raise or lower all voltages by the same c, without changing the voltage
differences. There is an “arbitrary constant” in v. For functions, we can raise or lower
f(x) by any constant amount C, without changing its derivative.

Calculus adds an arbitrary constant “+C” to indefinite integrals. Graph theory adds
(c, c, c, c) to the voltages. Linear algebra adds any vector $v_n$ in the nullspace to one
particular solution of Av = b.

The row space of A is also a subspace of $\mathbf{R}^4$. Every row adds to zero, because -1
cancels +1 in each row. Then every combination of the rows also adds to zero. This is just
saying that v = (c, c, c, c) in the nullspace is orthogonal to every vector in the row space.

For any connected graph with n nodes, the situation is the same. The vectors v =
(c, ..., c) fill the nullspace in $\mathbf{R}^n$. All rows are orthogonal to v; their components add to
zero. The row space $C(A^T)$ has dimension n - 1. This is the rank of A.


The Column Space and Left Nullspace 


The column space contains all combinations of the four columns. We expect three inde- 
pendent columns, since the rank is r = n — 1 = 3. The first three columns are independent 
(so are any three). But the four columns add to the zero vector, which says again that 
(1,1,1,1) is in the nullspace. How can we tell if a particular vector b is in the column 
space of an incidence matrix ? 


First answer Apply elimination to Av = b. On the left side, some combinations of rows 
will give zero rows. Then the same combination of b’s on the right side must be zero ! 
Here is the first combination that elimination will discover: 


Row 1 - Row 2 + Row 3 = Zero row. The right side b needs $b_1 - b_2 + b_3 = 0$.   (2)
Since A has m = 6 rows and its rank is r = 3, elimination leads to 6 - 3 zero rows
in the reduced matrix R. There will be three tests for the vector b to lie in the column space.
Elimination will lead to three conditions on b for Av = b to be solvable.

I want to find those conditions in a better way. The graph has three small loops. 


Second answer using loops Av contains differences in v’s. If we add differences 
around a closed loop in the graph, the cancellation leaves zero. Around the big triangle 
formed by edges 1, 3, —2 (the arrow goes backward on edge 2) the differences cancel out: 


Around a loop    $(v_2 - v_1) + (v_3 - v_2) - (v_3 - v_1) = 0.$


The components of Av add to zero around every loop. When b is in the column space
of A, then Av = b. The vector b must obey the voltage law:


Kirchhoff’s Voltage Law (KVL)   The components of b add to zero around every loop.


By testing all the loops, we decide whether b is in the column space. Av = b can be
solved exactly when the components of b satisfy all the same dependencies as the rows of A.
Then KVL is satisfied, elimination leads to 0 = 0, and Av = b is consistent.


Question  I can see four loops in the graph, three small and one large. We are only expecting
three tests, not four, for b to be in C(A). What is the explanation ? 


Answer Those four loops are not independent. If you combine the small loops in 
Figure 5.8, you get the large loop. So the tests from the small loops combine to give the 
test from the large loop. We only have to test KVL on the small loops. 


We have described the column space of A in two ways. First, C(A) contains all combinations
of the columns (and n - 1 columns are enough, the nth column is dependent).
Second, C(A) contains all vectors b that satisfy the Voltage Law. Around every loop the
components of b add to zero. We will now see that this is requiring b to be orthogonal
to every vector y in the nullspace of $A^T$. C(A) is orthogonal to the left nullspace $N(A^T)$.


Voltage laws


Loop A      $b_1 - b_4 + b_5 = 0$
Loop B      $b_4 - b_2 - b_6 = 0$
Loop C      $b_3 + b_6 - b_5 = 0$


Big loop A + B + C:   $b_1 - b_2 + b_3 = 0$


Figure 5.8: Loops reveal the column space of A and the nullspace of $A^T$ and the tests on b.
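The two answers (elimination and loops) can be compared in a short sketch. This is plain Python under stated assumptions: the loop vectors are read from the voltage laws of Figure 5.8, the function names `satisfies_kvl` and `solvable` are made up for the illustration, and solvability is decided by comparing the rank of A with the rank of the augmented matrix [A b].

```python
from fractions import Fraction

def rank(rows):
    """Rank by forward elimination with exact fractions."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A = [[-1, 1, 0, 0], [-1, 0, 1, 0], [0, -1, 1, 0],
     [-1, 0, 0, 1], [0, -1, 0, 1], [0, 0, -1, 1]]

# Signed edge coefficients of the three small loops (from the voltage laws).
LOOPS = {"A": [1, 0, 0, -1, 1, 0],
         "B": [0, -1, 0, 1, 0, -1],
         "C": [0, 0, 1, 0, -1, 1]}

def satisfies_kvl(b):
    """True when the components of b add to zero around every small loop."""
    return all(sum(y * bi for y, bi in zip(loop, b)) == 0 for loop in LOOPS.values())

def solvable(b):
    """True when Av = b has a solution: rank of [A b] equals rank of A."""
    return rank([row + [bi] for row, bi in zip(A, b)]) == rank(A)

v = [0, 1, 3, 6]                                             # sample voltages
b_good = [sum(a * x for a, x in zip(row, v)) for row in A]   # b = Av
b_bad = [1, 0, 0, 0, 0, 0]                                   # fails loop A

print(b_good, satisfies_kvl(b_good), solvable(b_good))
print(b_bad, satisfies_kvl(b_bad), solvable(b_bad))
```

The three loop tests and the rank test agree on both vectors, which is the content of the two descriptions of C(A).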


316 Chapter 5. Vector Spaces and Subspaces 


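$N(A^T)$ contains all solutions to $A^T y = 0$. Its dimension is m - r = 6 - 3: three y’s.

$$A^T y = \begin{bmatrix} -1 & -1 & 0 & -1 & 0 & 0 \\ 1 & 0 & -1 & 0 & -1 & 0 \\ 0 & 1 & 1 & 0 & 0 & -1 \\ 0 & 0 & 0 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \qquad (3)$$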

The true number of equations is r = 3 and not n = 4. Reason: The four equations add to
0 = 0. The fourth equation follows automatically from the first three.


What do the equations mean? The first equation says that $-y_1 - y_2 - y_4 = 0$.
The net flow into node 1 is zero. The fourth equation says that $y_4 + y_5 + y_6 = 0$.
Flow into the node minus flow out is zero. These equations are famous and fundamental:


Kirchhoff’s Current Law   $A^T y = 0$   Flow in equals flow out at each node.


This law deserves first place among the equations of applied mathematics. It expresses
“conservation” and “continuity” and “balance.” Nothing is lost, nothing is gained. When
currents or forces are balanced, the equation to solve is $A^T y = 0$. Notice the beautiful
fact that the matrix in this balance equation is the transpose of the incidence matrix A.


What are the actual solutions to $A^T y = 0$? The currents must balance themselves.
The easiest way is to flow around a loop. If a unit of current goes around the big triangle
(forward on edge 1, forward on 3, backward on 2), the vector is y = (1, -1, 1, 0, 0, 0). This
satisfies $A^T y = 0$. Every loop current is a solution to Kirchhoff’s Current Law.


Around the loop, flow in equals flow out at every node. The smaller loop A goes forward
on edge 1, forward on 5, back on 4. Then y = (1, 0, 0, -1, 1, 0) will have $A^T y = 0$.
Each loop in the graph gives a vector y in $N(A^T)$.


We expect three independent y’s, since 6 - 3 = 3. The three small loops in the graph
are independent. The big triangle seems to give a fourth y, but it is the sum of flows around
the small loops. The small loops A, B, C give a basis $y_1, y_2, y_3$ for the nullspace of $A^T$.
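A loop current is easy to build from a list of signed edge numbers, and the Current Law can then be checked node by node. The sketch below is plain Python; `loop_current` and `net_flow` are hypothetical helper names, and the directions used for loops B and C are assumptions taken from the voltage laws in Figure 5.8.

```python
from fractions import Fraction

def rank(rows):
    """Rank by forward elimination with exact fractions."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A = [[-1, 1, 0, 0], [-1, 0, 1, 0], [0, -1, 1, 0],
     [-1, 0, 0, 1], [0, -1, 0, 1], [0, 0, -1, 1]]

def loop_current(signed_edges, m):
    """Unit current around a loop: +k means forward on edge k, -k means backward."""
    y = [0] * m
    for e in signed_edges:
        y[abs(e) - 1] = 1 if e > 0 else -1
    return y

def net_flow(A, y):
    """A^T y: one balance equation per node."""
    return [sum(A[e][node] * y[e] for e in range(len(A))) for node in range(len(A[0]))]

big = loop_current([1, 3, -2], 6)       # big triangle from the text
loop_a = loop_current([1, 5, -4], 6)    # small loop A from the text
loop_b = loop_current([4, -6, -2], 6)   # assumed direction for loop B
loop_c = loop_current([3, 6, -5], 6)    # assumed direction for loop C

total = [a + b + c for a, b, c in zip(loop_a, loop_b, loop_c)]
print(big, net_flow(A, big))
print(total == big, rank([loop_a, loop_b, loop_c]))
```

The three small loops are independent (rank 3) and their sum reproduces the big loop, so the big triangle adds nothing new to the left nullspace.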


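$$\begin{matrix} \text{Solutions to } A^T y = 0 \\ \text{Big loop from three small loops} \end{matrix} \qquad y_1 + y_2 + y_3 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ -1 \\ 1 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \\ -1 \\ 1 \end{bmatrix} + \begin{bmatrix} 0 \\ -1 \\ 0 \\ 1 \\ 0 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$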
Summary  The m by n incidence matrix A comes from a connected graph with n nodes
and m edges. The row space and column space have dimension r = n - 1 = rank of A.
The nullspaces of A and $A^T$ have dimension 1 and m - r = m - n + 1:


1 The constant vectors (c, c, ..., c) make up the nullspace N(A).
2 There are r = n - 1 independent rows, from n - 1 edges with no loops (a tree).
3 Voltage law gives C(A): The components of Av add to zero around every loop.
4 Current law $A^T y = 0$: $N(A^T)$ from currents on m - r independent loops.


For every graph in a plane, linear algebra yields Euler’s formula : 


(number of nodes) - (number of edges) + (number of small loops) = 1.


This is (n) - (m) + (m - n + 1) = 1. The graph in our example has 4 - 6 + 3 = 1.
A single triangle has (3 nodes) - (3 edges) + (1 loop). On a 10-node tree with 9 edges
and no loops, Euler’s count is 10 - 9 + 0 = 1. All planar graphs lead to the answer 1.
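Euler's count can be reproduced from the rank alone, since the number of independent loops is m - r. The sketch below is plain Python for connected graphs only; the helper names and the choice of a path as the 10-node tree are assumptions for illustration.

```python
from fractions import Fraction

def rank(rows):
    """Rank by forward elimination with exact fractions."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def incidence(edges, n):
    """Row for edge (start, end) has -1 at start and +1 at end."""
    A = []
    for start, end in edges:
        row = [0] * n
        row[start - 1], row[end - 1] = -1, 1
        A.append(row)
    return A

def euler_count(edges, n):
    """nodes - edges + independent loops, with loops = m - rank(A)."""
    m = len(edges)
    loops = m - rank(incidence(edges, n))
    return n - m + loops, loops

complete4 = [(1, 2), (1, 3), (2, 3), (1, 4), (2, 4), (3, 4)]
triangle = [(1, 2), (2, 3), (1, 3)]
path_tree = [(i, i + 1) for i in range(1, 10)]   # one possible 10-node tree

print(euler_count(complete4, 4))
print(euler_count(triangle, 3))
print(euler_count(path_tree, 10))
```

Each call returns the pair (Euler count, number of independent loops). The count is 1 in all three cases because the rank of a connected graph's incidence matrix is n - 1.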


Trees 


A tree is a graph with no loops. Figure 5.9 shows two trees with n = 4 nodes. These 
graphs (and all our graphs) are connected: Between every two nodes there is a path of edges, 
so the graph doesn’t break into separate pieces. The tree must have m = n — 1 edges, 
to connect all n nodes. The rank of the incidence matrix is also r = n — 1. Then the 
number of loops in a tree is confirmed as m — r = 0 (no loops). 


$$A_1 = \begin{bmatrix} -1 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & -1 & 0 & 1 \end{bmatrix}$$


Figure 5.9: Two trees with n = 4 nodes and m = 3 edges. The rank of $A_1$ is r = m.


The incidence matrix A of a tree has independent rows. In fact the three rows of $A_1$ are
three independent rows 1, 2, 5 of the previous 6 by 4 matrix (for the complete graph).
That original graph contains 16 different trees.
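The count of 16 trees can be confirmed by brute force: a set of three edges is a tree on the four nodes exactly when its three rows of the incidence matrix are independent. The sketch below is plain Python, with edges indexed from 0 inside the code (so rows 1, 2, 5 of the text become indices 0, 1, 4).

```python
from fractions import Fraction
from itertools import combinations

def rank(rows):
    """Rank by forward elimination with exact fractions."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# Incidence matrix of the complete graph with 4 nodes and 6 edges.
A = [[-1, 1, 0, 0], [-1, 0, 1, 0], [0, -1, 1, 0],
     [-1, 0, 0, 1], [0, -1, 0, 1], [0, 0, -1, 1]]

# Keep every choice of 3 rows that is independent (rank 3): those are the trees.
trees = [combo for combo in combinations(range(6), 3)
         if rank([A[i] for i in combo]) == 3]
dependent = [combo for combo in combinations(range(6), 3) if combo not in trees]

print(len(trees))
print(dependent)    # the four triples of edges that close a triangle
```

Of the 20 ways to choose three edges, only the four triangles fail, which leaves 16 trees.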
The Adjacency Matrix and the Graph Laplacian


The adjacency matrix W is square. With n nodes in the graph, this matrix is n by n. If there
is an edge from node i to node j, then $W_{ij} = 1$. If no edge, then $W_{ij} = 0$. Since our edges
go both ways, W is symmetric. The diagonal entries are zero.

All information about the graph is in the adjacency matrix W, except the numbering and 
arrow directions of the edges. 

There are m 1’s above the diagonal of W, and also below. Section 7.5 will study the
graph Laplacian matrix $A^T A$ (A is the incidence matrix) and find this formula:


$$A^T A = D - W$$


The diagonal matrix D tells the “degree” of every node. This is the number of edges that
go in or out of that node. Here are W and $A^T A$ for the complete graph with six edges.

$$\text{Adjacency } W = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix} \qquad \text{Graph Laplacian } A^T A = \begin{bmatrix} 3 & -1 & -1 & -1 \\ -1 & 3 & -1 & -1 \\ -1 & -1 & 3 & -1 \\ -1 & -1 & -1 & 3 \end{bmatrix}$$


Every row of $A^T A$ adds to zero. The degree 3 on the diagonal cancels the -1’s off
the diagonal. The vector (1, 1, 1, 1) in the nullspace of A is also in the nullspace of $A^T A$.
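The relation between the Laplacian, the degrees and the adjacency matrix can be checked directly for the complete graph. The sketch below is plain Python; the helper names are made up for this illustration, and the identity being tested is $A^T A = D - W$.

```python
def incidence(edges, n):
    """Row for edge (start, end) has -1 at start and +1 at end."""
    A = []
    for start, end in edges:
        row = [0] * n
        row[start - 1], row[end - 1] = -1, 1
        A.append(row)
    return A

def gram(A):
    """A^T A: entry (i, j) is the dot product of columns i and j."""
    n = len(A[0])
    return [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]

def adjacency(edges, n):
    """Symmetric W with W[i][j] = 1 when nodes i and j share an edge."""
    W = [[0] * n for _ in range(n)]
    for start, end in edges:
        W[start - 1][end - 1] = 1
        W[end - 1][start - 1] = 1
    return W

edges = [(1, 2), (1, 3), (2, 3), (1, 4), (2, 4), (3, 4)]
n = 4
A = incidence(edges, n)
W = adjacency(edges, n)
degrees = [sum(row) for row in W]      # diagonal of D
D_minus_W = [[(degrees[i] if i == j else 0) - W[i][j] for j in range(n)]
             for i in range(n)]

L = gram(A)
print(L)
print(L == D_minus_W)
print([sum(row) for row in L])         # every row adds to zero
```

The diagonal entry is a column of A dotted with itself (one ±1 per edge at that node), and an off-diagonal entry is -1 exactly when one edge touches both nodes.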


Challenge Reconstruct a graph with arrows from A and a graph without arrows from W. 


| ae | ame ae 0101 
cs 1 , {1010 
BD ie So ed a W=16 101 
ee es Ce 1o2e 


REVIEW OF THE KEY IDEAS


1. The n nodes and m edges of a graph give n columns and m rows in A.

2. Each row of the incidence matrix A has -1 and 1 (start and end of that edge).

3. Voltage Law for C(A): The components of Av add to zero around any loop.

4. Current Law for $N(A^T)$: $A^T y$ = (flow in) minus (flow out) = zero at every node.

5. Rank of A = n - 1. Then $A^T y = 0$ for the currents y around m - n + 1 small loops.

6. The adjacency matrix W and the graph Laplacian $A^T A$ are symmetric n by n.


Problem Set 5.6 


Problems 1-7 and 8-13 are about the incidence matrices for these two graphs. 


[Figure: the triangle graph (3 nodes, 3 edges) and the square graph (4 nodes, 5 edges, two loops)]


1 Write down the 3 by 3 incidence matrix A for the triangle graph. The first row has 
—1 in column 1 and +1 in column 2. What vectors (v1, v2, v3) are in its nullspace ? 
How do you know that (1,0, 0) is not in its row space ? 


2 Write down $A^T$ for the triangle graph. Find a vector y in its nullspace. The components
of y are currents on the edges. How much current is going around the triangle?


3 By elimination on A find the echelon matrix R. What tree corresponds to the two
nonzero rows of R?

$$Av = b \qquad \begin{aligned} -v_1 + v_2 &= b_1 \\ -v_1 + v_3 &= b_2 \\ -v_2 + v_3 &= b_3 \end{aligned}$$

4 Choose a vector $(b_1, b_2, b_3)$ for which Av = b can be solved, and another vector b that
allows no solution. What are the dot products $y^T b$ for y = (1, -1, 1)?


5 Choose a vector $(f_1, f_2, f_3)$ for which $A^T y = f$ can be solved, and a vector f
that allows no solution. How are those f’s related to v = (1, 1, 1)? The equation
$A^T y = f$ is Kirchhoff’s law.


6 Multiply matrices to find $A^T A$. Choose a vector f for which $A^T A v = f$ can be
solved, and solve for v. Put those voltages v and currents y = -Av onto the triangle
graph. The vector f represents “current sources.”


7 Multiply $A^T A$ (still for the first graph) and find its nullspace. It should be the same
as N(A). Which vectors f are in its column space?


8 Write down the 5 by 4 incidence matrix A for the square graph with two loops.
Find one solution to Av = 0 and two solutions to $A^T y = 0$. The rank is ____.


9 Find two requirements on the b’s for the five differences $v_2 - v_1$, $v_3 - v_1$, $v_3 - v_2$,
$v_4 - v_2$, $v_4 - v_3$ to equal $b_1, b_2, b_3, b_4, b_5$. You have found Kirchhoff’s ____ Law
around the two ____ in the graph.
By elimination, reduce A to U. The three nonzero rows give the incidence matrix
for what graph? You found one tree in the square graph—find the other seven trees. 


Multiply $A^T A$ and explain how its entries come from columns of A (and the graph).
(a) The diagonal of the Laplacian matrix $A^T A$ counts edges into each node (the
degree). Why is this the dot product of a column with itself?
(b) The off-diagonals -1 or 0 tell which nodes i and j are connected. Why is -1
or 0 the dot product of column i with another column j?


Find the rank and the nullspace of $A^T A$. Why does $A^T A v = f$ have a solution only
if $f_1 + f_2 + f_3 + f_4 = 0$?


Write down the 4 by 4 adjacency matrix W for the square graph. Its entries 1 or 0
count paths of length 1 between nodes (those are just edges).

Important. Compute $W^2$ and check that its entries count the paths of length 2
between nodes. Why does $(W^2)_{ii}$ = degree of node i? Those paths go out and back.


A connected graph with 7 nodes and 7 edges has how many loops ? 


For the graph with 4 nodes, 6 edges, and 3 loops, add a new node. If you connect it
to one old node, Euler’s formula becomes ( ) - ( ) + ( ) = 1. If you connect it
to two old nodes, Euler’s formula becomes ( ) - ( ) + ( ) = 1.


Suppose A is a 12 by 9 incidence matrix from a connected (but unknown) graph. 


(a) How many columns of A are independent? 
(b) What condition on f makes it possible to solve $A^T y = f$?
(c) The diagonal entries of $A^T A$ give the number of edges into each node. What is
the sum of those diagonal entries?


Why does a complete graph with n = 6 nodes have m = 15 edges? A tree that
connects 6 nodes has only ____ edges and ____ loops.


How do you know that any n - 1 columns of the incidence matrix A are independent?
If they were dependent, the nullspace would contain a vector with a zero component.
But the nullspace of A actually contains ____.


(a) Find the Laplacian $A^T A$ for a complete graph with n nodes.


(b) If the edge from node 1 to node 3 is removed, what is the change in $A^T A$?


Suppose batteries of strength $b_1, \ldots, b_m$ are inserted into the m edges. Then the voltage
differences across edges become Av - b. Unit resistances give currents Av - b and
Kirchhoff’s Current Law is $A^T (Av - b) = 0$. Solve this system for the
square graph above when b = (1, 1, ..., 1).
CHAPTER 5 NOTES


Vectors are not necessarily column vectors. In the definition of a vector space,
addition x + y and scalar multiplication cx must obey the following eight rules:


(1) x + y = y + x

(2) x + (y + z) = (x + y) + z

(3) There is a unique “zero vector” such that x + 0 = x for all x

(4) For each x there is a unique vector -x such that x + (-x) = 0

(5) 1 times x equals x

(6) $(c_1 c_2)x = c_1(c_2 x)$

(7) c(x + y) = cx + cy

(8) $(c_1 + c_2)x = c_1 x + c_2 x$.
Here are practice questions to bring out the meaning of those eight rules. 


1. Suppose $(x_1, x_2) + (y_1, y_2)$ is defined to be $(x_1 + y_2, x_2 + y_1)$. With the usual
multiplication $cx = (cx_1, cx_2)$, which of the eight conditions are not satisfied?


2. Suppose the multiplication cx is defined to produce $(cx_1, 0)$ instead of $(cx_1, cx_2)$.
With the usual addition in $\mathbf{R}^2$, are the eight conditions satisfied?


3. (a) Which rules are broken if we keep only the positive numbers x > 0 in $\mathbf{R}^1$?
Every c must be allowed. The half-line is not a subspace.


(b) The positive numbers with x + y and cx redefined to equal the usual xy and $x^c$
do satisfy the eight rules. Test rule 7 when c = 3, x = 2, y = 1. (Then x + y = 2
and cx = 8.) Which number acts as the “zero vector”?
4. The matrix A = ie a | is a “vector” in the space M of all 2 by 2 matrices. Write 
down the zero vector in this space, the vector 5A, and the vector — A. What matrices 
are in the smallest subspace containing A? 


5. The functions $f(x) = x^2$ and g(x) = 5x are “vectors in function space.” Which
rule is broken if multiplying f(x) by c gives f(cx) instead of cf(x)? Keep the usual
addition f(x) + g(x).


6. If the sum of the “vectors” f(x) and g(x) is defined to be the function f(g(x)),
then the “zero vector” is g(x) = x. Keep the usual scalar multiplication cf(x) and
find two rules that are broken.
Row rank equals column rank: The first big theorem


The dimension of the row space $C(A^T)$ equals the dimension of the column space C(A).
Here I can outline four proofs (the fourth is neat). Proofs 2, 3, 4 do not use elimination.


Proof 1 Reduce A to R without changing the dimensions of the row and column spaces. 
The row space actually stays the same. The column space changes, going from A to R, 
but its dimension stays the same. The theorem is clear for R: 


rnonzerorowsinR <¢+  r=dimension of row space 
r pivotcolumnsin R ++ r=dimension of column space 


Proof 2 (G. Mackiw, Mathematics Magazine 68 1996). Suppose $x_1, \ldots, x_r$ is a basis
for the row space of A. The next paragraph will show that $Ax_1, \ldots, Ax_r$ are independent
vectors in the column space. Then dim (row space) $= r \le$ dim (column space). The same
reasoning applies to $A^T$, reversing that inequality. So the two dimensions must be equal.


Suppose $c_1 Ax_1 + \cdots + c_r Ax_r = A(c_1 x_1 + \cdots + c_r x_r) = Av = 0$.


Then $v$ is in the nullspace of A and also in the row space (it is a combination of the $x$'s).
So $v$ is orthogonal to itself and $v = 0$. All the $c$'s must be zero since the $x$'s are a basis.
This shows that $c_1 Ax_1 + \cdots + c_r Ax_r = 0$ requires that all $c_i = 0$. Therefore
$Ax_1, \ldots, Ax_r$ are independent vectors in the column space: dimension of $C(A) \ge r$.


Proof 3 If A has r independent rows and s independent columns, we can move those rows
to the top of A and those columns to the left. They meet in an r by s submatrix B:


$$A = \begin{bmatrix} B & C \\ D & E \end{bmatrix} \ (B \text{ and } C \text{ fill the top } r \text{ rows}) \qquad \begin{bmatrix} B & C \\ D & E \end{bmatrix} \begin{bmatrix} v \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$

Suppose $s > r$. Since $Bv = 0$ has r equations in s unknowns, it has a solution $v \ne 0$.


The upper part of the matrix has $Bv + C0 = 0$ as shown. The lower rows of A are
combinations of the upper rows, so they also have $Dv + E0 = 0$. But now a combination
of the first s independent columns $\begin{bmatrix} B \\ D \end{bmatrix}$ of A, with coefficients from $v$, is producing zero.
Conclusion: $s > r$ cannot happen. Thinking similarly for $A^T$, $r > s$ cannot happen.


Proof 4 Suppose r column vectors $u_1, \ldots, u_r$ are a basis for the column space $C(A)$.
Then each column of A is a combination of $u$'s. Column 1 of A is $w_{11}u_1 + \cdots + w_{r1}u_r$,
with some coefficients $w$. The whole matrix A equals UW = (m by r)(r by n).


$$A = \begin{bmatrix} u_1 & \cdots & u_r \end{bmatrix} \begin{bmatrix} w_{11} & \cdots & w_{1n} \\ \vdots & & \vdots \\ w_{r1} & \cdots & w_{rn} \end{bmatrix} = UW.$$


Now look differently at A = UW. Each row of A is a combination of the r rows of W!
Therefore the row space of A has dimension $\le r$.

This proves that (dimension of row space) $\le$ (dimension of column space) for any A.
Apply this reasoning to $A^T$, and the two dimensions must be equal.

To my way of thinking, that is a really cool proof. 
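
The theorem is easy to test on a small matrix. The sketch below is only an illustration and is not one of the four proofs: the helper names `rank` and `transpose` and the sample matrix are assumptions made for this example. It counts pivots by elimination (the idea behind Proof 1), using exact fractions, for A and for its transpose.

```python
from fractions import Fraction

def rank(matrix):
    """Count the pivots found by elimination, using exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    pivots = 0
    for col in range(n_cols):
        # Look for a nonzero entry in this column, at or below the next pivot row.
        pivot_row = next((r for r in range(pivots, len(rows))
                          if rows[r][col] != 0), None)
        if pivot_row is None:
            continue  # no pivot in this column
        rows[pivots], rows[pivot_row] = rows[pivot_row], rows[pivots]
        for r in range(pivots + 1, len(rows)):
            factor = rows[r][col] / rows[pivots][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivots])]
        pivots += 1
    return pivots

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

# Hypothetical 3 by 4 example: row 2 is twice row 1, so only two rows are independent.
A = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [1, 0, 1, 0]]

print(rank(A), rank(transpose(A)))
```

Both counts agree, as the theorem requires, even though A has 3 rows and 4 columns.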
The Transpose and Row Space of d/dt


This book is constantly emphasizing the parallels between linear differential equations and
matrix equations. In both cases we have null solutions and particular solutions. The
nullspace for a differential equation $Dy = 0$ contains the null solutions $y_n$:


Matrices A: $Ax_n = 0$ Derivatives D: $Dy_n = y_n'' + By_n' + Cy_n = 0$


The nullspace of this D has dimension 2. This is the reason that y needs two initial
conditions. We look for solutions $y_n = e^{st}$ and usually we find $e^{s_1 t}$ and $e^{s_2 t}$.
These functions are a basis for the nullspace. In case $s_2 = s_1$, the second function is $te^{s_1 t}$.
All is completely parallel to matrix equations, until we ask this question : 


What is the “row space” of D when a differential operator has no rows ? 


I want to propose two answers to this question. They come from faithfully imitating the 
Fundamental Theorem of Linear Algebra. That theorem applies to D, because D is linear. 


Answer 1 The row space of D contains all functions $y_r(t)$ orthogonal to $e^{s_1 t}$ and $e^{s_2 t}$.
Answer 2 The row space of D contains all outputs $y_r(t) = D^T q(t)$ from inputs $q(t)$.
This looks good, but when are functions “orthogonal”? What is the “transpose” of D?


Dot product of functions = Inner product of $y_n$ and $y_r$:

$$(y_n(t), y_r(t)) = \int_{-\infty}^{\infty} y_n(t)\, y_r(t)\, dt.$$


Do you see this as reasonable? For vectors, we add the products $v_i w_i$. For functions, we
integrate $y_n y_r$. If the vectors or functions are complex, we add $\bar{v}_i w_i$ or integrate $\bar{y}_n y_r$.
Then $(v, v)$ and $(y_n, y_n)$ give the squared lengths $\|v\|^2$ for vectors and $\|y_n\|^2$ for functions.


The inner product tells us the correct meaning of the transpose. For matrices, $A^T$ is
the matrix that obeys the inner product law $(Av, w) = (v, A^T w)$. For differential equations,


$$(Df, g) = \int_{-\infty}^{\infty} (f'' + Bf' + Cf)\, g(t)\, dt = \int_{-\infty}^{\infty} f(t)\,(g'' - Bg' + Cg)\, dt = (f, D^T g).$$


Integration by parts gave $\int f'g = -\int fg'$. Two integrations gave $\int f''g = \int fg''$
with a plus sign (from two minus signs). Formally, that equation tells us $D^T$:

$$D = \frac{d^2}{dt^2} + B\frac{d}{dt} + C \quad \text{leads to} \quad D^T = \frac{d^2}{dt^2} - B\frac{d}{dt} + C \qquad \left(\frac{d}{dt} \text{ is antisymmetric}\right)$$


Now the row space of all $D^T q(t)$ makes sense even when D has no rows. Can we just
verify that any row space function $D^T q(t)$ is orthogonal to any nullspace function $y_n(t)$?


$$(y_n(t), D^T q(t)) = (Dy_n(t), q(t)) = \int (0)\, q(t)\, dt = 0.$$


Shakespeare said it best at the end of Hamlet: The rest is silence. 
Chapter 6


Eigenvalues and Eigenvectors


6.1. Introduction to Eigenvalues


Eigenvalues are the key to a system of n differential equations: dy/dt = ay becomes
dy/dt = Ay. Now A is a matrix and y is a vector $(y_1(t), \ldots, y_n(t))$. The vector
y changes with time. Here is a system of two equations with its 2 by 2 matrix A:


$$\begin{aligned} y_1' &= 4y_1 + y_2 \\ y_2' &= 3y_1 + 2y_2 \end{aligned} \quad \text{is} \quad \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}' = \begin{bmatrix} 4 & 1 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \qquad (1)$$

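How to solve this coupled system, $y' = Ay$ with $y_1$ and $y_2$ in both equations? The
good way is to find solutions that “uncouple” the problem. We want $y_1$ and $y_2$ to grow
or decay in exactly the same way (with the same $e^{\lambda t}$):


Look for $y_1(t) = e^{\lambda t}a$ and $y_2(t) = e^{\lambda t}b$. In vector notation this is $y(t) = e^{\lambda t}x$ (2)


That vector $x = (a, b)$ is called an eigenvector. The growth rate $\lambda$ is an eigenvalue. This
section will show how to find $x$ and $\lambda$. Here I will jump to $x$ and $\lambda$ for the matrix in (1).


First eigenvector $x = \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and first eigenvalue $\lambda = 5$ in $y = e^{5t}x$:

$$y_1 = e^{5t} \text{ has } y_1' = 5e^{5t} = 4y_1 + y_2 \qquad y_2 = e^{5t} \text{ has } y_2' = 5e^{5t} = 3y_1 + 2y_2$$


Second eigenvector $x = \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 1 \\ -3 \end{bmatrix}$ and second eigenvalue $\lambda = 1$ in $y = e^{t}x$:

This $y = e^{\lambda t}x$ is a second solution:

$$y_1 = e^{t} \text{ has } y_1' = e^{t} = 4y_1 + y_2 \qquad y_2 = -3e^{t} \text{ has } y_2' = -3e^{t} = 3y_1 + 2y_2$$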
Those two $x$'s and $\lambda$'s combine with any $c_1$, $c_2$ to give the complete solution to $y' = Ay$:

$$\text{Complete solution} \qquad y(t) = c_1 e^{5t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} + c_2 e^{t} \begin{bmatrix} 1 \\ -3 \end{bmatrix} \qquad (3)$$


This is exactly what we hope to achieve for other equations $y' = Ay$ with constant A.
The solutions we want have the special form $y(t) = e^{\lambda t}x$. Substitute that solution
into $y' = Ay$, to see the equation $Ax = \lambda x$ for an eigenvalue $\lambda$ and its eigenvector $x$:


$\frac{d}{dt}(e^{\lambda t}x) = A(e^{\lambda t}x)$ is $\lambda e^{\lambda t}x = Ae^{\lambda t}x$. Divide both sides by $e^{\lambda t}$.


Eigenvalue and eigenvector of A $\qquad Ax = \lambda x \qquad$ (4)


Those eigenvalues (5 and 1 for this A) are a new way to see into the heart of a matrix.
This chapter enters a different part of linear algebra, based on $Ax = \lambda x$. The last page of
Chapter 6 has eigenvalue-eigenvector information about many different matrices.
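
A quick numerical check can confirm that $y = e^{\lambda t}x$ solves $y' = Ay$ exactly when $Ax = \lambda x$. The sketch below is an illustration only. The function names `matvec` and `residual` and the sample time t = 0.3 are assumptions for this example. It compares $y'(t) = \lambda e^{\lambda t}x$ with $Ay(t)$ for the two eigenpairs of the matrix in (1), and for one wrong growth rate.

```python
import math

A = [[4, 1],
     [3, 2]]

def matvec(M, v):
    """Matrix times vector, with M stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def residual(lam, x, t):
    """Largest entry of y'(t) - A y(t) in absolute value, for y(t) = e^(lam t) x."""
    y = [math.exp(lam * t) * xi for xi in x]
    dy = [lam * yi for yi in y]          # derivative of e^(lam t) x
    Ay = matvec(A, y)
    return max(abs(d - a) for d, a in zip(dy, Ay))

print(residual(5, [1, 1], 0.3))    # essentially zero: lambda = 5, x = (1, 1)
print(residual(1, [1, -3], 0.3))   # essentially zero: lambda = 1, x = (1, -3)
print(residual(2, [1, 1], 0.3))    # not zero: 2 is not an eigenvalue for x = (1, 1)
```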


Finding Eigenvalues from $\det(A - \lambda I) = 0$


Almost all vectors change direction, when they are multiplied by A. Certain very
exceptional vectors $x$ are in the same direction as $Ax$. Those are the “eigenvectors.”
The vector $Ax$ (in the same direction as $x$) is a number $\lambda$ times the original $x$.

The eigenvalue $\lambda$ tells whether the eigenvector $x$ is stretched or shrunk or reversed
or left unchanged, when it is multiplied by A. We may find $\lambda = 2$ or $\frac{1}{2}$ or $-1$ or $1$.
The eigenvalue $\lambda$ could be zero! $Ax = 0x$ puts this eigenvector $x$ in the nullspace of A.

If A is the identity matrix, every vector has $Ax = x$. All vectors are eigenvectors of I.
Most 2 by 2 matrices have two eigenvector directions and two eigenvalues $\lambda_1$ and $\lambda_2$.


To find the eigenvalues, write the equation $Ax = \lambda x$ in the good form $(A - \lambda I)x = 0$.
If $(A - \lambda I)x = 0$, then $A - \lambda I$ is a singular matrix. Its determinant must be zero.


The determinant of $A - \lambda I = \begin{bmatrix} a - \lambda & b \\ c & d - \lambda \end{bmatrix}$ is $(a - \lambda)(d - \lambda) - bc = 0$.


Our goal is to shift A by the right amount $\lambda I$, so that $(A - \lambda I)x = 0$ has a solution.
Then $x$ is the eigenvector, $\lambda$ is the eigenvalue, and $A - \lambda I$ is not invertible. So we look
for numbers $\lambda$ that make $\det(A - \lambda I) = 0$. I will start with the matrix A in equation (1).


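Example 1 For $A = \begin{bmatrix} 4 & 1 \\ 3 & 2 \end{bmatrix}$ subtract $\lambda$ from the diagonal and find the determinant:


$$\det(A - \lambda I) = \det \begin{bmatrix} 4 - \lambda & 1 \\ 3 & 2 - \lambda \end{bmatrix} = \lambda^2 - 6\lambda + 5 = (\lambda - 5)(\lambda - 1). \qquad (5)$$


I factored the quadratic, to see the two eigenvalues $\lambda_1 = 5$ and $\lambda_2 = 1$. The matrices
$A - 5I$ and $A - I$ are singular. We have found the $\lambda$'s from $\det(A - \lambda I) = 0$.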
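
For any 2 by 2 matrix the quadratic $\det(A - \lambda I) = \lambda^2 - (a + d)\lambda + (ad - bc)$ can be solved by the quadratic formula. The sketch below is one possible way to code that step. The function name `eigenvalues_2x2` is an assumption for this example, and `cmath.sqrt` is used so that a negative discriminant gives complex eigenvalues instead of an error.

```python
import cmath

def eigenvalues_2x2(M):
    """Roots of lambda^2 - (trace) lambda + (determinant) = 0 for a 2 by 2 matrix."""
    (a, b), (c, d) = M
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)   # complex square root
    return (trace + root) / 2, (trace - root) / 2

lam1, lam2 = eigenvalues_2x2([[4, 1], [3, 2]])
print(lam1, lam2)   # the eigenvalues 5 and 1 of Example 1, returned as complex numbers
```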
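For each of the eigenvalues 5 and 1, we now find an eigenvector $x$:


$$(A - 5I)x = 0 \text{ is } \begin{bmatrix} -1 & 1 \\ 3 & -3 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \text{ and } x = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

$$(A - 1I)x = 0 \text{ is } \begin{bmatrix} 3 & 1 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \text{ and } x = \begin{bmatrix} 1 \\ -3 \end{bmatrix}$$


Those were the vectors $(a, b)$ in our special solutions $y = e^{\lambda t}x$. Both components of y
have the growth rate $\lambda$, so the differential equation was easily solved: $y = e^{\lambda t}x$.


Two eigenvectors gave two solutions. Combinations $c_1 y_1 + c_2 y_2$ give all solutions.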


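Example 2 Find the eigenvalues and eigenvectors of the Markov matrix $A = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix}$.


$$\det(A - \lambda I) = \det \begin{bmatrix} .8 - \lambda & .3 \\ .2 & .7 - \lambda \end{bmatrix} = \lambda^2 - \frac{3}{2}\lambda + \frac{1}{2} = (\lambda - 1)\left(\lambda - \frac{1}{2}\right).$$


I factored the quadratic into $\lambda - 1$ times $\lambda - \frac{1}{2}$, to see the two eigenvalues $\lambda = 1$ and $\frac{1}{2}$.


The eigenvectors $x_1$ and $x_2$ are in the nullspaces of $A - I$ and $A - \frac{1}{2}I$:


$(A - I)x_1 = 0$ is $Ax_1 = x_1$. The first eigenvector is $x_1 = (.6, .4)$


$(A - \frac{1}{2}I)x_2 = 0$ is $Ax_2 = \frac{1}{2}x_2$. The second eigenvector is $x_2 = (1, -1)$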


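$$x_1 = \begin{bmatrix} .6 \\ .4 \end{bmatrix} \text{ and } Ax_1 = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix} \begin{bmatrix} .6 \\ .4 \end{bmatrix} = x_1 \quad (Ax = x \text{ means that } \lambda_1 = 1)$$

$$x_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \text{ and } Ax_2 = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} .5 \\ -.5 \end{bmatrix} \quad (\text{this is } \tfrac{1}{2}x_2 \text{ so } \lambda_2 = \tfrac{1}{2}).$$


If $x_1$ is multiplied again by A, we still get $x_1$. Every power of A will give $A^n x_1 = x_1$.
Multiplying $x_2$ by A gave $\frac{1}{2}x_2$, and if we multiply again we get $(\frac{1}{2})^2$ times $x_2$.


When A is squared, the eigenvectors $x$ stay the same. $A^2 x = A(\lambda x) = \lambda(Ax) = \lambda^2 x$.


Notice $\lambda^2$. This pattern keeps going, because the eigenvectors stay in their own directions.
They never get mixed. The eigenvectors of $A^{100}$ are the same $x_1$ and $x_2$. The eigenvalues
of $A^{100}$ are $1^{100} = 1$ and $(\frac{1}{2})^{100}$ = very small number.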


We mention that this particular A is a Markov matrix. Its entries are positive and 
every column adds to 1. Those facts guarantee that the largest eigenvalue must be \ = 1. 
Figure 6.1: The eigenvectors keep their directions. $A^2$ has eigenvalues $1^2$ and $(.5)^2$.
[The figure shows $Ax_1 = x_1$ and $A^2 x_1 = (1)^2 x_1$ for $\lambda = 1$, and $Ax_2 = .5x_2$ and $A^2 x_2 = (.5)^2 x_2$ for $\lambda = .5$, $\lambda^2 = .25$.]


The eigenvector $Ax_1 = x_1$ is the steady state, which all columns of $A^k$ will approach.


Giant Markov matrices are the key to Google’s search algorithm. It ranks web pages. 
Linear algebra has made Google one of the most valuable companies in the world. 


Powers of a Matrix 


When the eigenvalues of A are known, we immediately know the eigenvalues of all
powers $A^k$ and shifts $A + cI$ and all functions of A. Each eigenvector of A is also an
eigenvector of $A^k$ and $A^{-1}$ and $A + cI$:


If $Ax = \lambda x$ then $A^k x = \lambda^k x$ and $A^{-1}x = \frac{1}{\lambda}x$ and $(A + cI)x = (\lambda + c)x$. (6)


Start again with $A^2 x$, which is A times $Ax = \lambda x$. Then $A\lambda x$ is the same as $\lambda Ax$ for any
number $\lambda$, and $\lambda Ax$ is $\lambda^2 x$. We have proved that $A^2 x = \lambda^2 x$.


For higher powers $A^k x$, continue multiplying $Ax = \lambda x$ by A. Step by step you reach
$A^k x = \lambda^k x$. For the eigenvalues of $A^{-1}$, first multiply by $A^{-1}$ and then divide by $\lambda$:


Eigenvalues of $A^{-1}$ are $\frac{1}{\lambda}$: $\quad Ax = \lambda x \quad x = \lambda A^{-1}x \quad A^{-1}x = \frac{1}{\lambda}x \quad$ (7)


We are assuming that $A^{-1}$ exists! If A is invertible then $\lambda$ will never be zero.
Invertible matrices have all $\lambda \ne 0$. Singular matrices have the eigenvalue $\lambda = 0$.


The shift from A to $A + cI$ just adds c to every eigenvalue (don’t change $x$):
Shift of A: If $Ax = \lambda x$ then $(A + cI)x = Ax + cx = (\lambda + c)x$. (8)
As long as we keep the same eigenvector $x$, we can allow any function of A:


Functions of A: $\quad (A^2 + 2A + 5I)x = (\lambda^2 + 2\lambda + 5)x \qquad e^A x = e^{\lambda}x$. (9)
I slipped in $e^A = I + A + \frac{1}{2}A^2 + \cdots$ to show that infinite series produce matrices too.


Let me show you the powers of the Markov matrix A in Example 2. That starting matrix
is unrecognizable after a few steps.


$$A = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix} \quad A^2 = \begin{bmatrix} .70 & .45 \\ .30 & .55 \end{bmatrix} \quad A^3 = \begin{bmatrix} .650 & .525 \\ .350 & .475 \end{bmatrix} \quad A^{100} = \begin{bmatrix} .6000 & .6000 \\ .4000 & .4000 \end{bmatrix} \qquad (10)$$


$A^{100}$ was found by using $\lambda = 1$ and its eigenvector $[.6, .4]$, not by multiplying 100 matrices.
The eigenvalues of A are 1 and $\frac{1}{2}$, so the eigenvalues of $A^{100}$ are 1 and $(\frac{1}{2})^{100}$. That last
number is extremely small, and we can’t see it in the first 30 digits of $A^{100}$.

How could you multiply $A^{99}$ times another vector like $v = (.8, .2)$? This is not an
eigenvector, but $v$ is a combination of eigenvectors. This is a key idea, to express any vector
$v$ by using the eigenvectors.


$$\text{Separate into eigenvectors} \quad v = x_1 + (.2)x_2 \qquad \begin{bmatrix} .8 \\ .2 \end{bmatrix} = \begin{bmatrix} .6 \\ .4 \end{bmatrix} + \begin{bmatrix} .2 \\ -.2 \end{bmatrix} \qquad (11)$$


Each eigenvector is multiplied by its eigenvalue, when we multiply the vector by A.


After 99 steps, $x_1$ is unchanged and $x_2$ is multiplied by $(\frac{1}{2})^{99}$:

$$A^{99} \begin{bmatrix} .8 \\ .2 \end{bmatrix} \text{ is } A^{99}(x_1 + .2x_2) = x_1 + (.2)\left(\tfrac{1}{2}\right)^{99} x_2 = \begin{bmatrix} .6 \\ .4 \end{bmatrix} + \begin{bmatrix} \text{very small} \\ \text{vector} \end{bmatrix}$$


This is the first column of $A^{100}$, because $v = (.8, .2)$ is the first column of A. The number
we originally wrote as .6000 was not exact. We left out $(.2)(\frac{1}{2})^{99}$ which wouldn’t show up
for 30 decimal places.

The eigenvector $x_1 = (.6, .4)$ is a “steady state” that doesn’t change (because $\lambda_1 = 1$).
The eigenvector $x_2$ is a “decaying mode” that virtually disappears (because $\lambda_2 = 1/2$).
The higher the power of A, the more closely its columns approach the steady state. 
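
The splitting $v = x_1 + (.2)x_2$ can be checked against plain repeated multiplication. The sketch below is an illustration under stated assumptions: the helper names `matvec`, `apply_power` and `predicted` are invented for this example, and ordinary floating point numbers are used, so agreement is only up to roundoff.

```python
A = [[0.8, 0.3],
     [0.2, 0.7]]

def matvec(M, v):
    """Matrix times vector, with M stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def apply_power(M, v, k):
    """Multiply v by M a total of k times (no eigenvalues used)."""
    for _ in range(k):
        v = matvec(M, v)
    return v

def predicted(k):
    """A^k v from the eigenvectors: x1 + (.2)(1/2)^k x2 with x1 = (.6, .4), x2 = (1, -1)."""
    decay = 0.2 * 0.5 ** k
    return [0.6 + decay, 0.4 - decay]

v = [0.8, 0.2]
print(apply_power(A, v, 2), predicted(2))     # both are the first column of A^3
print(apply_power(A, v, 99), predicted(99))   # both are very close to (.6, .4)
```

The eigenvector formula needs one power of 1/2. Repeated multiplication needs k matrix-vector products.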


Bad News About AB and A + B


Normally the eigenvalues of A and B (separately) do not tell us the eigenvalues of AB.
We also don’t know about A + B. When A and B have different eigenvectors,
our reasoning fails. The good results for $A^2$ are wrong for AB and A + B, when AB is
different from BA. The eigenvalues won’t come from A and B separately:


$$A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \quad B = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \quad AB = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \quad BA = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \quad A + B = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

All the eigenvalues of A and B are zero. But AB has an eigenvalue $\lambda = 1$, and A + B
has eigenvalues 1 and $-1$. But one rule holds: AB and BA have the same eigenvalues.
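
Those claims about A, B, AB, BA and A + B can be confirmed with the 2 by 2 quadratic formula. The sketch below is an illustration only. The names `matmul` and `eigenvalues_2x2` are assumptions for this example, and the eigenvalues are sorted by real part so that lists can be compared.

```python
import cmath

def matmul(X, Y):
    """Product of two 2 by 2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eigenvalues_2x2(M):
    """Both roots of lambda^2 - (trace) lambda + det = 0, sorted by real part."""
    (a, b), (c, d) = M
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return sorted([(trace + root) / 2, (trace - root) / 2], key=lambda z: z.real)

A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
A_plus_B = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

print(eigenvalues_2x2(A), eigenvalues_2x2(B))                        # all zero
print(eigenvalues_2x2(matmul(A, B)), eigenvalues_2x2(matmul(B, A)))  # 0 and 1 for both
print(eigenvalues_2x2(A_plus_B))                                     # -1 and 1
```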
Determinants


The determinant is a single number with amazing properties. It is zero when the matrix has
no inverse. That leads to the eigenvalue equation $\det(A - \lambda I) = 0$. When A is invertible,
the determinant of $A^{-1}$ is $1/(\det A)$. Every entry in $A^{-1}$ is a ratio of two determinants.

I want to summarize the algebra, leaving the details for my companion textbook
Introduction to Linear Algebra. The difficulty with $\det(A - \lambda I) = 0$ is that an n by n
determinant involves n! terms. For n = 5 this is 120 terms, generally impossible to use.


For n = 3 there are six terms, three with plus signs and three with minus. Each of
those six terms includes one number from every row and every column:


Determinant from n! = 6 terms. Three plus signs, three minus signs:

$$\det \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} = +(1)(5)(9) + (2)(6)(7) + (3)(4)(8) - (3)(5)(7) - (1)(6)(8) - (2)(4)(9)$$


That shows how to find the six terms. For this particular matrix the total must be det A = 0,
because the matrix happens to be singular: row 1 + row 3 equals 2(row 2).
Let me start with five useful properties of determinants, for all square matrices. 


1. Subtracting a multiple of one row from another row leaves det A unchanged. 
2. The determinant reverses sign when two rows are exchanged. 

3. If A is triangular then det A = product of diagonal entries. 

4. The determinant of AB equals (det A) times (det B). 

5. The determinant of $A^T$ equals the determinant of A.


By combining 1, 2, 3 you will see how the determinant comes from elimination:
The determinant equals $\pm$ (product of the pivots). (12)


Property 1 says that A and U have the same determinant, unless rows are exchanged.
Property 2 says that an odd number of exchanges would leave $\det A = -\det U$.
Property 3 says that det U is the product of the pivots on its main diagonal.


When elimination takes A to U, we find $\det A = \pm$ (product of the pivots). This is how
all numerical software (like MATLAB or Python or Julia) would compute det A.
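
Properties 1, 2, 3 translate directly into a short routine. The sketch below is a teaching illustration, not the code inside any of those packages: the name `det_by_elimination` is an assumption, exact fractions replace floating point, and rows are exchanged only when a zero sits in the pivot position.

```python
from fractions import Fraction

def det_by_elimination(matrix):
    """Determinant as (+ or -) the product of the pivots."""
    U = [[Fraction(x) for x in row] for row in matrix]
    n = len(U)
    sign = 1
    product = Fraction(1)
    for col in range(n):
        # Find a row with a nonzero entry to serve as the pivot row.
        pivot_row = next((r for r in range(col, n) if U[r][col] != 0), None)
        if pivot_row is None:
            return Fraction(0)            # no pivot: the matrix is singular
        if pivot_row != col:
            U[col], U[pivot_row] = U[pivot_row], U[col]
            sign = -sign                  # Property 2: an exchange reverses the sign
        for r in range(col + 1, n):
            factor = U[r][col] / U[col][col]
            # Property 1: subtracting a multiple of a row leaves det unchanged.
            U[r] = [a - factor * b for a, b in zip(U[r], U[col])]
        product *= U[col][col]            # Property 3: multiply the pivots
    return sign * product

print(det_by_elimination([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))   # 0 (singular)
print(det_by_elimination([[2, 1], [1, 2]]))                    # 3
print(det_by_elimination([[0, 1], [1, 0]]))                    # -1 (one row exchange)
```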

Plus and minus signs play a big part in determinants. Half of the n! terms have plus
signs, and half come with minus signs. For n = 3, one row exchange puts 3, 5, 7
or 1, 6, 8 or 2, 4, 9 on the main diagonal. A minus sign from one row exchange.
Two row exchanges (an even number) take you back to (2)(6)(7) and (3)(4)(8). This indicates
how the 24 terms would go for n = 4, twelve terms with plus and twelve with minus.
Even permutation matrices have det P = 1 and odd permutations have det P = $-1$.


Inverse of A If $\det A \ne 0$, you can solve $Av = b$ and find $A^{-1}$ using determinants:


$$\text{Cramer's Rule} \qquad v_1 = \frac{\det B_1}{\det A} \quad v_2 = \frac{\det B_2}{\det A} \quad \cdots \quad v_n = \frac{\det B_n}{\det A} \qquad (13)$$


The matrix $B_j$ replaces the jth column of A by the vector b. Cramer’s Rule is expensive!


To find the columns of $A^{-1}$, we solve $AA^{-1} = I$. That is the Gauss-Jordan idea: For
each column b in I, solve $Av = b$ to find a column $v$ of $A^{-1}$.

In this special case, when b is a column of I, the numbers $\det B_j$ in Cramer’s Rule are
called cofactors. They reduce to determinants of size n − 1, because b has so many zeros.
Every entry of $A^{-1}$ is a cofactor of A divided by the determinant of A.
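
Cramer's Rule is short to write down, even if it is expensive to run. The sketch below is an illustration with assumed names (`det`, `cramer_solve`). It uses cofactor expansion along the first row for the determinants, which is only sensible for small matrices with integer entries, and it returns exact fractions.

```python
from fractions import Fraction

def det(M):
    """Determinant by cofactor expansion along row 1 (fine for small n only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def cramer_solve(A, b):
    """Solve Av = b with v_j = det(B_j) / det(A); B_j has b in column j of A."""
    d = det(A)
    if d == 0:
        raise ValueError("Cramer's Rule needs det A != 0")
    solution = []
    for j in range(len(A)):
        B_j = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(A)]
        solution.append(Fraction(det(B_j), d))
    return solution

S = [[2, 1],
     [1, 2]]
print(cramer_solve(S, [3, 3]))   # v = (1, 1)
print(cramer_solve(S, [1, 0]))   # first column of the inverse: (2/3, -1/3)
```

Solving with b equal to each column of I, one column at a time, produces the whole inverse.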

I will close with three examples, to introduce the “trace” of a matrix and to show 
that real matrices can have imaginary (or complex) eigenvalues and eigenvectors. 


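Example 3 Find the eigenvalues and eigenvectors of $S = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$.


Solution You can see that $x = (1, 1)$ will be in the same direction as $Sx = (3, 3)$.
Then $x$ is an eigenvector of S with $\lambda = 3$. We want the matrix $S - \lambda I$ to be singular.


$$S = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \qquad \det(S - \lambda I) = \det \begin{bmatrix} 2 - \lambda & 1 \\ 1 & 2 - \lambda \end{bmatrix} = \lambda^2 - 4\lambda + 3 = 0.$$


Notice that 3 is the determinant of S (without $\lambda$). And 4 is the sum 2 + 2 down the central
diagonal of S. The diagonal sum 4 is the “trace” of S. It equals $\lambda_1 + \lambda_2 = 3 + 1$.


Now factor $\lambda^2 - 4\lambda + 3$ into $(\lambda - 3)(\lambda - 1)$. The matrix $S - \lambda I$ is singular (zero
determinant) for $\lambda = 3$ and $\lambda = 1$. Each eigenvalue has an eigenvector:


$$\lambda_1 = 3 \qquad (S - 3I)x_1 = \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

$$\lambda_2 = 1 \qquad (S - I)x_2 = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$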


The eigenvalues 3 and 1 are real. The eigenvectors (1,1) and (1,—1) are orthogonal. 
Those properties always come together for symmetric matrices (Section 6.5). 

Here is an antisymmetric matrix with $A^T = -A$. It rotates all real vectors by $\theta = 90°$.
Real vectors can’t be eigenvectors of a rotation matrix because it changes their direction. 
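Example 4 This real matrix has imaginary eigenvalues $i$, $-i$ and complex eigenvectors:


$$A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = -A^T \qquad \det(A - \lambda I) = \det \begin{bmatrix} -\lambda & -1 \\ 1 & -\lambda \end{bmatrix} = \lambda^2 + 1 = 0.$$


That determinant $\lambda^2 + 1$ is zero for $\lambda = i$ and $-i$. The eigenvectors are $(1, -i)$ and $(1, i)$:


$$\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -i \end{bmatrix} = \begin{bmatrix} i \\ 1 \end{bmatrix} = i \begin{bmatrix} 1 \\ -i \end{bmatrix} \qquad \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ i \end{bmatrix} = \begin{bmatrix} -i \\ 1 \end{bmatrix} = -i \begin{bmatrix} 1 \\ i \end{bmatrix}$$


Somehow those complex vectors $x_1$ and $x_2$ don’t get rotated (I don’t really know how).
Multiplying the eigenvalues $(i)(-i)$ gives det A = 1. Adding the eigenvalues gives
$(i) + (-i) = 0$. This equals the sum 0 + 0 down the diagonal of A.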


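Product of eigenvalues = determinant Sum of eigenvalues = “trace” (14)


Those are true statements for all square matrices. The trace is the sum $a_{11} + \cdots + a_{nn}$
down the main diagonal of A. This sum and product are especially valuable for 2 by 2
matrices, when the determinant $\lambda_1 \lambda_2 = ad - bc$ and the trace $\lambda_1 + \lambda_2 = a + d$ completely
determine $\lambda_1$ and $\lambda_2$. Look now at rotation of a plane through any angle $\theta$.


Example 5 Rotation comes from an orthogonal matrix Q. Then $\lambda_1 = e^{i\theta}$ and $\lambda_2 = e^{-i\theta}$:


$$Q = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \qquad \begin{aligned} \lambda_1 &= \cos\theta + i\sin\theta \\ \lambda_2 &= \cos\theta - i\sin\theta \end{aligned} \qquad \begin{aligned} \lambda_1 + \lambda_2 &= 2\cos\theta = \text{trace} \\ \lambda_1 \lambda_2 &= 1 = \text{determinant} \end{aligned}$$


I multiplied $(\lambda_1)(\lambda_2)$ to get $\cos^2\theta + \sin^2\theta = 1$. In polar form $e^{i\theta}$ times $e^{-i\theta}$ is 1.
The eigenvectors of Q are $(1, -i)$ and $(1, i)$ for all rotation angles $\theta$.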
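
Complex arithmetic makes Example 5 easy to test. The sketch below is an illustration only. The helper names `rotation`, `matvec`, `eigen_residual` and the angle 0.7 are arbitrary assumptions for this example. It checks that $Qx = \lambda x$ for the two complex eigenpairs, and that the sum and product of the eigenvalues match the trace and determinant.

```python
import cmath
import math

def rotation(theta):
    """The 2 by 2 rotation matrix Q for the angle theta (in radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def eigen_residual(M, lam, x):
    """Largest entry of Mx - lam*x in absolute value (zero for an eigenpair)."""
    Mx = matvec(M, x)
    return max(abs(a - lam * b) for a, b in zip(Mx, x))

theta = 0.7                      # any angle works; this value is arbitrary
Q = rotation(theta)
lam1 = cmath.exp(1j * theta)     # cos(theta) + i sin(theta)
lam2 = cmath.exp(-1j * theta)    # cos(theta) - i sin(theta)

print(eigen_residual(Q, lam1, [1, -1j]))   # essentially zero
print(eigen_residual(Q, lam2, [1, 1j]))    # essentially zero
print(lam1 + lam2, lam1 * lam2)            # 2 cos(theta) and 1, up to roundoff
```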


Before ending this section, I need to tell you the truth. It is not easy to find eigenvalues
and eigenvectors of large matrices. The equation $\det(A - \lambda I) = 0$ is more or less limited
to 2 by 2 and 3 by 3. For larger matrices, we can gradually make them triangular without
changing the eigenvalues. For triangular matrices the eigenvalues are on the diagonal.
A good code to compute $\lambda$ and $x$ is free in LAPACK. The MATLAB command is eig(A).


REVIEW OF THE KEY IDEAS


1. $Ax = \lambda x$ says that eigenvectors $x$ keep the same direction when multiplied by A.

2. $Ax = \lambda x$ also says that $\det(A - \lambda I) = 0$. This equation determines n eigenvalues.

3. The eigenvalues of $A^2$ and $A^{-1}$ are $\lambda^2$ and $\lambda^{-1}$, with the same eigenvectors as A.

4. Singular matrices have $\lambda = 0$. Triangular matrices have $\lambda$’s on their diagonal.

5. The sum down the main diagonal of A (the trace) is the sum of the eigenvalues.

6. The determinant is the product of the $\lambda$’s. It is also $\pm$ (product of the pivots).
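Problem Set 6.1


1  Example 2 has powers of this Markov matrix A:

$$A = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix} \quad \text{and} \quad A^2 = \begin{bmatrix} .70 & .45 \\ .30 & .55 \end{bmatrix} \quad \text{and} \quad A^{\infty} = \begin{bmatrix} .6 & .6 \\ .4 & .4 \end{bmatrix}.$$

(a) A has eigenvalues 1 and $\frac{1}{2}$. Find the eigenvalues of $A^2$ and $A^{\infty}$.


(b) What are the eigenvectors of $A^{\infty}$? One eigenvector is in the nullspace.


(c) Check the determinant of $A^2$ and $A^{\infty}$. Compare with $(\det A)^2$ and $(\det A)^{\infty}$.


2  Find the eigenvalues and the eigenvectors of these two matrices:


$$A = \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix} \quad \text{and} \quad A + I = \begin{bmatrix} 2 & 4 \\ 2 & 4 \end{bmatrix}.$$


$A + I$ has the ____ eigenvectors as A. Its eigenvalues are ____ by 1.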
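3  Compute the eigenvalues and eigenvectors of A and also $A^{-1}$:

$$A = \begin{bmatrix} 0 & 2 \\ 1 & 1 \end{bmatrix} \quad \text{and} \quad A^{-1} = \begin{bmatrix} -1/2 & 1 \\ 1/2 & 0 \end{bmatrix}.$$

$A^{-1}$ has the ____ eigenvectors as A. When A has eigenvalues $\lambda_1$ and $\lambda_2$, its inverse
has eigenvalues ____. Check that $\lambda_1 + \lambda_2$ = trace of A = 0 + 1.


4  Compute the eigenvalues and eigenvectors of A and $A^2$:


$$A = \begin{bmatrix} -1 & 3 \\ 2 & 0 \end{bmatrix} \quad \text{and} \quad A^2 = \begin{bmatrix} 7 & -3 \\ -2 & 6 \end{bmatrix}.$$


$A^2$ has the same ____ as A. When A has eigenvalues $\lambda_1$ and $\lambda_2$, the eigenvalues of
$A^2$ are ____. In this example, why is $\lambda_1^2 + \lambda_2^2 = 13$?


5  Find the eigenvalues of A and B (easy for triangular matrices) and A + B:


$$A = \begin{bmatrix} 3 & 0 \\ 1 & 1 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & 1 \\ 0 & 3 \end{bmatrix} \quad \text{and} \quad A + B = \begin{bmatrix} 4 & 1 \\ 1 & 4 \end{bmatrix}.$$


Eigenvalues of A + B (are equal to) (might not be equal to) eigenvalues of A plus
eigenvalues of B.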
6  Find the eigenvalues of A and B and AB and BA:


$$A = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad AB = \begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix} \quad \text{and} \quad BA = \begin{bmatrix} 3 & 2 \\ 1 & 1 \end{bmatrix}.$$


(a) Are the eigenvalues of AB equal to eigenvalues of A times eigenvalues of B?
(b) Are the eigenvalues of AB equal to the eigenvalues of BA? Yes!


7  Elimination produces a triangular matrix U. The eigenvalues of U are on its diagonal
(why?). They are not the eigenvalues of A. Give a 2 by 2 example of A and U.


8  (a) If you know that $x$ is an eigenvector, the way to find $\lambda$ is to ____.


(b) If you know that $\lambda$ is an eigenvalue, the way to find $x$ is to ____.

9  What do you do to the equation $Ax = \lambda x$, in order to prove (a), (b), and (c)?


(a) $\lambda^2$ is an eigenvalue of $A^2$, as in Problem 4.
(b) $\lambda^{-1}$ is an eigenvalue of $A^{-1}$, as in Problem 3.
(c) $\lambda + 1$ is an eigenvalue of $A + I$, as in Problem 2.


10  Find the eigenvalues and eigenvectors for both of these Markov matrices A and $A^{\infty}$.
Explain from those answers why $A^{100}$ is close to $A^{\infty}$:


a=[4 3] me 2e= [35 als] 


11  A 3 by 3 matrix B has eigenvalues 0, 1, 2. This information allows you to find:


(a) the rank of B (b) the eigenvalues of $B^2$ (c) the eigenvalues of $(B^2 + I)^{-1}$.


12  Find three eigenvectors for this matrix P. Projection matrices only have $\lambda = 1$ and 0.
Eigenvectors are in or orthogonal to the subspace that P projects onto.


$$\text{Projection matrix } P^2 = P = P^T \qquad P = \begin{bmatrix} .2 & .4 & 0 \\ .4 & .8 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$


If two eigenvectors $x$ and $y$ share the same repeated eigenvalue $\lambda$, so do all their
combinations $cx + dy$. Find an eigenvector of P with no zero components.


13  From the unit vector $u = (\frac{1}{6}, \frac{1}{6}, \frac{3}{6}, \frac{5}{6})$ construct the rank one projection matrix
$P = uu^T$. This matrix has $P^2 = P$ because $u^T u = 1$.

(a) Explain why $Pu = (uu^T)u$ equals $u$. Then $u$ is an eigenvector with $\lambda = 1$.

(b) If $v$ is perpendicular to $u$ show that $Pv = 0$. Then $\lambda = 0$.


(c) Find three independent eigenvectors of P all with eigenvalue $\lambda = 0$.
14  Solve $\det(Q - \lambda I) = 0$ by the quadratic formula to reach $\lambda = \cos\theta \pm i\sin\theta$:


$$Q = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \text{ rotates the } xy \text{ plane by the angle } \theta. \text{ No real } \lambda\text{'s.}$$


Find the eigenvectors of Q by solving $(Q - \lambda I)x = 0$. Use $i^2 = -1$.


15  Find three 2 by 2 matrices that have $\lambda_1 = \lambda_2 = 0$. The trace is zero and the
determinant is zero. A might not be the zero matrix but check that $A^2$ is all zeros.


16  This matrix is singular with rank one. Find three $\lambda$’s and three eigenvectors:


$$\text{Rank one} \qquad A = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \begin{bmatrix} 2 & 1 & 2 \end{bmatrix} = \begin{bmatrix} 2 & 1 & 2 \\ 4 & 2 & 4 \\ 2 & 1 & 2 \end{bmatrix}.$$


17  When $a + b = c + d$ show that $(1, 1)$ is an eigenvector and find both eigenvalues:


5 1 a b 
Use the trace to find 2» A= E i A= k ale 


18  If A has $\lambda_1 = 4$ and $\lambda_2 = 5$ then $\det(A - \lambda I) = (\lambda - 4)(\lambda - 5) = \lambda^2 - 9\lambda + 20$.
Find three matrices that have trace $a + d = 9$ and determinant 20, so $\lambda = 4$ and 5.


19  Suppose $Au = 0u$ and $Av = 3v$ and $Aw = 5w$. The eigenvalues are 0, 3, 5.


(a) Give a basis for the nullspace of A and a basis for the column space.
(b) Find a particular solution to $Ax = v + w$. Find all solutions.


(c) $Ax = u$ has no solution. If it did then ____ would be in the column space.


20  Choose the last row of A to produce (a) eigenvalues 4 and 7 (b) any $\lambda_1$ and $\lambda_2$.


Companion matrix A= Fe | : 


21  The eigenvalues of A equal the eigenvalues of $A^T$. This is because $\det(A - \lambda I)$
equals $\det(A^T - \lambda I)$. That is true because ____. Show by an example that the
eigenvectors of A and $A^T$ are not the same.


22  Construct any 3 by 3 Markov matrix M: positive entries down each column add to 1.
Show that $M^T(1, 1, 1) = (1, 1, 1)$. By Problem 21, $\lambda = 1$ is also an eigenvalue
of M. Challenge: A 3 by 3 singular Markov matrix with trace $\frac{1}{2}$ has what $\lambda$’s?


23  Suppose A and B have the same eigenvalues $\lambda_1, \ldots, \lambda_n$ with the same independent
eigenvectors $x_1, \ldots, x_n$. Then A = B. Reason: Any vector $v$ is a combination
$c_1 x_1 + \cdots + c_n x_n$. What is $Av$? What is $Bv$?


24  The block B has eigenvalues 1, 2 and C has eigenvalues 3, 4 and D has eigenvalues
5, 7. Find the eigenvalues of the 4 by 4 matrix A:


OR e320 

ie BeiG.| & r= 2" 3°04 

a oe Baa O% 207 767 Al 

OE gel 6: 

25  Find the rank and the four eigenvalues of A and C:

Leh deh IesQy it 20 
Ses ire eit OF ESO 
STE gel eae eM woo 
Le ly alg tl: OP FO 


ie ee tee (Cabs ees 
i Oeet of aX ea ee ae 
Der A Nat ge al; eae Se ee Seas wees 
ro as De Sil vas ah wa) 


27  (Review) Find the eigenvalues of A, B, and C:
a oe 00 1 Dade 22 
A= 0.45 and B=|0 2 0O An Cs | 2d 
0 0 6 3010" 0 De De? 


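28  Every permutation matrix leaves $x = (1, 1, \ldots, 1)$ unchanged. Then $\lambda = 1$. Find
two more $\lambda$’s (possibly complex) for these permutations, from $\det(P - \lambda I) = 0$:


$$P = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \quad \text{and} \quad P = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}.$$


29  The determinant of A equals the product $\lambda_1 \lambda_2 \cdots \lambda_n$. Start with the polynomial
$\det(A - \lambda I)$ separated into its n factors (always possible). Then set $\lambda = 0$:


$$\det(A - \lambda I) = (\lambda_1 - \lambda)(\lambda_2 - \lambda) \cdots (\lambda_n - \lambda) \quad \text{so} \quad \det A = \text{____}.$$


30  The sum of the diagonal entries (the trace) equals the sum of the eigenvalues:


$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \text{ has } \det(A - \lambda I) = \lambda^2 - (a + d)\lambda + ad - bc = 0.$$


The quadratic formula gives the eigenvalues $\lambda = (a + d + \sqrt{\ \ \ \ })/2$ and $\lambda$ = ____.
Their sum is ____. If A has $\lambda_1 = 3$ and $\lambda_2 = 4$ then $\det(A - \lambda I)$ = ____.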
6.2 Diagonalizing a Matrix


When $x$ is an eigenvector, multiplication by A is just multiplication by a number $\lambda$:
$Ax = \lambda x$. All the difficulties of matrices are swept away. Instead of an interconnected
system, we can follow the eigenvectors separately. It is like having a diagonal matrix, with
no off-diagonal interconnections. The 100th power of a diagonal matrix is easy.

The point of this section is very direct. The matrix A turns into a diagonal matrix $\Lambda$
when we use the eigenvectors properly. This is the matrix form of our key idea. We start
right off with that one essential computation.


Diagonalization Suppose the n by n matrix A has n linearly independent eigenvectors
$x_1, \ldots, x_n$. Put them into the columns of an eigenvector matrix V. Then $V^{-1}AV$ is the
eigenvalue matrix $\Lambda$, and $\Lambda$ is diagonal:


$$\begin{matrix} \text{Eigenvector matrix } V \\ \text{Eigenvalue matrix } \Lambda \end{matrix} \qquad V^{-1}AV = \Lambda = \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix} \qquad (1)$$


The matrix A is “diagonalized.” We use capital lambda for the eigenvalue matrix, because
of the small $\lambda$’s (the eigenvalues) on its diagonal.


Proof Multiply A times its eigenvectors, which are the columns of V. The first column of
AV is $Ax_1$. That is $\lambda_1 x_1$. Each column of V is multiplied by its eigenvalue $\lambda_i$:


$$A \text{ times } V \qquad AV = A \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix} = \begin{bmatrix} \lambda_1 x_1 & \cdots & \lambda_n x_n \end{bmatrix}$$


The trick is to split this matrix AV into V times $\Lambda$:

$$V \text{ times } \Lambda \qquad \begin{bmatrix} \lambda_1 x_1 & \cdots & \lambda_n x_n \end{bmatrix} = \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix} \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix} = V\Lambda.$$


Keep those matrices in the right order! Then $\lambda_1$ multiplies the first column $x_1$, as shown.
The diagonalization is complete, and we can write $AV = V\Lambda$ in two good ways:


$$AV = V\Lambda \quad \text{is} \quad V^{-1}AV = \Lambda \quad \text{or} \quad A = V\Lambda V^{-1}. \qquad (2)$$


The matrix V has an inverse, because its columns (the eigenvectors of A) were assumed
to be linearly independent. Without n independent eigenvectors, we can’t diagonalize.

A and $\Lambda$ have the same eigenvalues $\lambda_1, \ldots, \lambda_n$. The eigenvectors are different. The
job of the original eigenvectors $x_1, \ldots, x_n$ was to diagonalize A. Those eigenvectors in V
produce $A = V\Lambda V^{-1}$. You will soon see the simplicity and importance and meaning of
the k th power $A^k = V\Lambda^k V^{-1}$.
Sections 6.2 and 6.3 solve first order difference and differential equations.


$$u_{k+1} = Au_k \qquad u_k = A^k u_0 = c_1 \lambda_1^k x_1 + \cdots + c_n \lambda_n^k x_n$$


$$dy/dt = Ay \qquad y(t) = e^{At}y(0) = c_1 e^{\lambda_1 t}x_1 + \cdots + c_n e^{\lambda_n t}x_n.$$


The idea is the same for both problems: n independent eigenvectors give a basis.
We can write $u_0$ and $y(0)$ as combinations of eigenvectors. Then we follow each
eigenvector as k increases and t increases: $A^k x$ is $\lambda^k x$ and $e^{At}x$ is $e^{\lambda t}x$.


Some matrices don’t have n independent eigenvectors (because of repeated $\lambda$’s).
Then $A^k u_0$ and $e^{At}y(0)$ are still correct, but they lead to $k\lambda^k x$ and $te^{\lambda t}x$: not so good.


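Example 1 Here A is triangular so the $\lambda$’s are on its diagonal: $\lambda = 1$ and $\lambda = 6$.


$$\text{Eigenvectors in } V \qquad \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 5 \\ 0 & 6 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 6 \end{bmatrix} \qquad (V^{-1})(A)(V) = \Lambda$$


In other words $A = V\Lambda V^{-1}$. Then watch $A^2 = V\Lambda V^{-1}V\Lambda V^{-1}$. When you remove
$V^{-1}V = I$, this becomes $A^2 = V\Lambda^2 V^{-1}$. The same eigenvectors for A and $A^2$ are in V.
The squared eigenvalues are in $\Lambda^2$.


The k th power will be $A^k = V\Lambda^k V^{-1}$. And $\Lambda^k$ just contains $1^k$ and $6^k$:


$$A^k = \begin{bmatrix} 1 & 5 \\ 0 & 6 \end{bmatrix}^k = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & \\ & 6^k \end{bmatrix} \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 6^k - 1 \\ 0 & 6^k \end{bmatrix}.$$


With k = 1 we get A. With k = 0 we get $A^0 = I$ (eigenvalues $\lambda^0 = 1$). With k = −1
we get the inverse $A^{-1}$. You can see how $A^2$ = [1 35; 0 36] fits the formula when k = 2.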
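
The formula $A^k = V\Lambda^k V^{-1}$ for Example 1 can be tested for several k at once. The sketch below is an illustration with assumed names (`matmul`, `power_of_A`). It uses exact fractions so that k = −1 gives the exact inverse.

```python
from fractions import Fraction

def matmul(X, Y):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

V = [[1, 1], [0, 1]]        # eigenvectors (1, 0) and (1, 1) in the columns
V_inv = [[1, -1], [0, 1]]   # inverse of V
eigenvalues = [Fraction(1), Fraction(6)]

def power_of_A(k):
    """A^k = V Lambda^k V^(-1); negative k is allowed since no eigenvalue is zero."""
    Lambda_k = [[eigenvalues[0] ** k, 0], [0, eigenvalues[1] ** k]]
    return matmul(matmul(V, Lambda_k), V_inv)

print(power_of_A(1))    # A itself: rows (1, 5) and (0, 6)
print(power_of_A(2))    # rows (1, 35) and (0, 36)
print(power_of_A(-1))   # the inverse: rows (1, -5/6) and (0, 1/6)
```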


Here are four remarks before we use $\Lambda$ again.


Remark 1 When the eigenvalues $\lambda_1, \ldots, \lambda_n$ are all different, the eigenvectors $x_1, \ldots, x_n$
are independent. Any matrix that has no repeated eigenvalues can be diagonalized.


Remark 2 We can multiply eigenvectors by any nonzero constants. $Ax = \lambda x$ will remain
true. In Example 1, we can divide the eigenvector $(1, 1)$ by $\sqrt{2}$ to produce a unit vector.


Remark 3 The eigenvectors in V come in the same order as the eigenvalues in $\Lambda$. To reverse
the order 1, 6 in $\Lambda$, put the eigenvector $(1, 1)$ before $(1, 0)$ in V:


$$\begin{matrix} \text{New order } 6, 1 \\ \text{New order in } V \end{matrix} \qquad \begin{bmatrix} 0 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 1 & 5 \\ 0 & 6 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 6 & 0 \\ 0 & 1 \end{bmatrix} = \Lambda_{\text{new}}$$

To diagonalize A we must use an eigenvector matrix. From $V^{-1}AV = \Lambda$ we know that
$AV = V\Lambda$. Suppose the first column of V is $x$. Then the first columns of $AV$ and $V\Lambda$ are
$Ax$ and $\lambda_1 x$. For those to be equal, $x$ must be an eigenvector.
Remark 4 (Warning for repeated eigenvalues) Some matrices have too few eigenvectors (fewer than $n$). Those matrices cannot be diagonalized. Here are examples:

Not diagonalizable, only 1 eigenvector:

$$A = \begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix} \quad\text{and}\quad A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$$

Their eigenvalues happen to be 0 and 0. The problem is the repetition of $\lambda$.

Only one line of eigenvectors:

$$Ax = 0x \quad\text{means}\quad \begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix} x = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \quad\text{and}\quad x = c\begin{bmatrix} 1 \\ 1 \end{bmatrix}.$$

There is no second eigenvector, so the unusual matrix $A$ cannot be diagonalized.

Those matrices are the best examples to test any statement about eigenvectors. In many 
true-false questions, non-diagonalizable matrices lead to false. 

Remember that there is no connection between invertibility and diagonalizability : 


- Invertibility is concerned with the eigenvalues ($\lambda = 0$ or $\lambda \neq 0$).
- Diagonalizability needs $n$ independent eigenvectors.

Each eigenvalue has at least one eigenvector! $A - \lambda I$ is singular. If $(A - \lambda I)x = 0$ leads you to $x = 0$, $\lambda$ is not an eigenvalue. Look for a mistake in solving $\det(A - \lambda I) = 0$.

Eigenvectors for $n$ different $\lambda$’s are independent. Then $V^{-1}AV = \Lambda$ will succeed.
Eigenvectors for repeated $\lambda$’s could be dependent. $V$ might not be invertible.


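Example 2. Powers of A The Markov matrix $A$ in the last section had $\lambda_1 = 1$ and $\lambda_2 = .5$. Here is $A = V\Lambda V^{-1}$ with those eigenvalues in the matrix $\Lambda$:

$$A = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix} = \begin{bmatrix} .6 & 1 \\ .4 & -1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & .5 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ .4 & -.6 \end{bmatrix} = V\Lambda V^{-1}$$

The eigenvectors $(.6, .4)$ and $(1, -1)$ are in the columns of $V$. They are also the eigenvectors of $A^2$. Watch how $A^2$ has the same $V$, and the eigenvalue matrix of $A^2$ is $\Lambda^2$:

$$\text{Same } V \text{ for } A^2 \qquad A^2 = V\Lambda V^{-1}V\Lambda V^{-1} = V\Lambda^2 V^{-1}. \qquad (3)$$

Just keep going, and you see why the high powers $A^k$ approach a “steady state”:

$$\text{Powers of } A \qquad A^k = V\Lambda^k V^{-1} = \begin{bmatrix} .6 & 1 \\ .4 & -1 \end{bmatrix} \begin{bmatrix} 1^k & 0 \\ 0 & (.5)^k \end{bmatrix} \begin{bmatrix} 1 & 1 \\ .4 & -.6 \end{bmatrix}.$$

As $k$ gets larger, $(.5)^k$ gets smaller. In the limit it disappears completely. That limit is $A^\infty$:

$$\text{Limit } k \to \infty \qquad A^\infty = \begin{bmatrix} .6 & 1 \\ .4 & -1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ .4 & -.6 \end{bmatrix} = \begin{bmatrix} .6 & .6 \\ .4 & .4 \end{bmatrix}.$$

The limit has the steady state eigenvector $x_1$ in both columns.

Question When does $A^k \to$ zero matrix? Answer All $|\lambda| < 1$.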
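
To watch the powers of the Markov matrix settle down, here is a small Python sketch. It multiplies out $V\Lambda^k V^{-1}$ by hand for this one matrix, so the closed-form entries inside `markov_power` (a name chosen here for illustration) are specific to $A = [.8\ .3;\ .2\ .7]$.

```python
def markov_power(k):
    """A^k = V Lambda^k V^{-1} for the Markov matrix A = [.8 .3; .2 .7]."""
    h = 0.5 ** k   # the only entry of Lambda^k that changes with k
    # Entries obtained by multiplying V * diag(1, h) * V^{-1}
    # with V = [.6 1; .4 -1] and V^{-1} = [1 1; .4 -.6].
    return [[0.6 + 0.4 * h, 0.6 - 0.6 * h],
            [0.4 - 0.4 * h, 0.4 + 0.6 * h]]

for k in (1, 2, 10, 50):
    print(k, markov_power(k))
```

Each column approaches the steady state $(.6, .4)$ because the term with $(.5)^k$ dies out while the term with $1^k$ stays.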
Fibonacci Numbers


We present a famous example, where eigenvalues tell how fast the Fibonacci numbers grow.
Every new Fibonacci number is the sum of the two previous $F$’s:


The sequence $0, 1, 1, 2, 3, 5, 8, 13, \ldots$ comes from $F_{k+2} = F_{k+1} + F_k$.


These numbers turn up in a fantastic variety of applications. Plants grow in spirals, and a pear tree has 8 growths for every 3 turns. The champion is a sunflower that had 233 seeds in 144 loops. Those are the Fibonacci numbers $F_{13}$ and $F_{12}$. Our problem is more basic.


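Problem: Find the Fibonacci number $F_{100}$. The slow way is to apply the rule $F_{k+2} = F_{k+1} + F_k$ one step at a time. By adding $F_6 = 8$ to $F_7 = 13$ we reach $F_8 = 21$. Eventually we come to $F_{100}$. Linear algebra gives a better way.

The key is to begin with a matrix equation $u_{k+1} = Au_k$. That is a one-step rule for vectors, while Fibonacci gave a two-step rule for scalars. We match those rules by putting two Fibonacci numbers into a vector $u_k$. Then you will see the matrix $A$.

$$u_k = \begin{bmatrix} F_{k+1} \\ F_k \end{bmatrix}. \quad\text{The rule}\quad \begin{matrix} F_{k+2} = F_{k+1} + F_k \\ F_{k+1} = F_{k+1} \end{matrix} \quad\text{is}\quad u_{k+1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} u_k. \qquad (5)$$

Every step multiplies by $A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$. After 100 steps we reach $u_{100} = A^{100}u_0$:

$$u_0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad u_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad u_2 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}, \quad u_3 = \begin{bmatrix} 3 \\ 2 \end{bmatrix}, \quad \ldots, \quad u_{100} = \begin{bmatrix} F_{101} \\ F_{100} \end{bmatrix}.$$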


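This problem is just right for eigenvalues. To find them, subtract $\lambda I$ from $A$:

$$A - \lambda I = \begin{bmatrix} 1 - \lambda & 1 \\ 1 & -\lambda \end{bmatrix} \quad\text{leads to}\quad \det(A - \lambda I) = \lambda^2 - \lambda - 1.$$

The equation $\lambda^2 - \lambda - 1 = 0$ is solved by the quadratic formula $\left(-b \pm \sqrt{b^2 - 4ac}\right)/2a$:

$$\text{Eigenvalues} \qquad \lambda_1 = \frac{1 + \sqrt{5}}{2} \approx 1.618 \quad\text{and}\quad \lambda_2 = \frac{1 - \sqrt{5}}{2} \approx -.618.$$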


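These eigenvalues lead to eigenvectors $x_1 = (\lambda_1, 1)$ and $x_2 = (\lambda_2, 1)$. Step 2 finds the combination of those eigenvectors that gives $u_0 = (1, 0)$:

$$\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \frac{1}{\lambda_1 - \lambda_2}\left( \begin{bmatrix} \lambda_1 \\ 1 \end{bmatrix} - \begin{bmatrix} \lambda_2 \\ 1 \end{bmatrix} \right) \quad\text{or}\quad u_0 = \frac{x_1 - x_2}{\lambda_1 - \lambda_2}. \qquad (6)$$

Step 3 multiplies the eigenvectors $x_1$ and $x_2$ by $(\lambda_1)^{100}$ and $(\lambda_2)^{100}$:

$$A^{100} \text{ times } u_0 \qquad u_{100} = \frac{(\lambda_1)^{100}x_1 - (\lambda_2)^{100}x_2}{\lambda_1 - \lambda_2}. \qquad (7)$$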
We want $F_{100}$ = second component of $u_{100}$. The second components of $x_1$ and $x_2$ are 1. The difference between $(1 + \sqrt{5})/2$ and $(1 - \sqrt{5})/2$ is $\lambda_1 - \lambda_2 = \sqrt{5}$. We have $F_{100}$:

$$F_{100} = \frac{1}{\sqrt{5}}\left[ \left(\frac{1 + \sqrt{5}}{2}\right)^{100} - \left(\frac{1 - \sqrt{5}}{2}\right)^{100} \right] \approx 3.54 \cdot 10^{20}. \qquad (8)$$

Is this a whole number? Yes. The fractions and square roots must disappear, because Fibonacci’s rule $F_{k+2} = F_{k+1} + F_k$ stays with integers. The second term in (8) is less than $\frac{1}{2}$ so it must move the first term to the nearest whole number:


$$k\text{th Fibonacci number} = \frac{\lambda_1^k - \lambda_2^k}{\lambda_1 - \lambda_2} = \text{nearest integer to } \frac{1}{\sqrt{5}}\left(\frac{1 + \sqrt{5}}{2}\right)^k. \qquad (9)$$

The ratio of $F_6$ to $F_5$ is $8/5 = 1.6$. The ratio $F_{101}/F_{100}$ must be very close to the limiting ratio $(1 + \sqrt{5})/2$. The Greeks called this number the “golden mean”. For some reason a rectangle with sides 1.618 and 1 looks especially graceful.
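
Both routes to the Fibonacci numbers are easy to try. The sketch below, in Python, applies the one-step rule $u_{k+1} = Au_k$ with exact integers and compares it with the nearest-integer formula. The function names are illustrative, and the rounding version uses floating point, so it is only compared for small $k$.

```python
import math

def fib_by_matrix(k):
    """F_k from u_k = A^k u_0 with A = [1 1; 1 0] and u_0 = (F_1, F_0) = (1, 0)."""
    upper, lower = 1, 0
    for _ in range(k):
        upper, lower = upper + lower, upper   # one step u_{k+1} = A u_k
    return lower                              # second component of u_k is F_k

def fib_by_rounding(k):
    """Nearest integer to lambda_1^k / sqrt(5); floating point, small k only."""
    lam1 = (1 + math.sqrt(5)) / 2
    return round(lam1 ** k / math.sqrt(5))

assert all(fib_by_matrix(k) == fib_by_rounding(k) for k in range(40))
print(fib_by_matrix(100))   # 354224848179261915075
```

The exact value has 21 digits, in agreement with the estimate $3.54 \cdot 10^{20}$.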


Matrix Powers $A^k$


Fibonacci’s example is a typical difference equation $u_{k+1} = Au_k$. Each step multiplies by $A$. The solution is $u_k = A^k u_0$. We want to make clear how diagonalizing the matrix gives a quick way to compute $A^k$ and find $u_k$ in three steps.

The eigenvector matrix $V$ produces $A = V\Lambda V^{-1}$. This is perfectly suited to computing powers, because every time $V^{-1}$ multiplies $V$ we get $I$:

$$\text{Powers of } A \qquad A^k u_0 = (V\Lambda V^{-1}) \cdots (V\Lambda V^{-1})\, u_0 = V\Lambda^k V^{-1} u_0$$

I will split $V\Lambda^k V^{-1}u_0$ into three steps. Equation (10) puts those steps together in $u_k$.


1. Write $u_0$ as a combination $c_1x_1 + \cdots + c_nx_n$ of the eigenvectors. Then $c = V^{-1}u_0$.

2. Multiply each number $c_i$ by $(\lambda_i)^k$. Now we have $\Lambda^k V^{-1}u_0$.

3. Add up the pieces $c_i(\lambda_i)^k x_i$ to find the solution $u_k = A^k u_0$. This is $V\Lambda^k V^{-1}u_0$.

$$u_k = A^k u_0 = c_1(\lambda_1)^k x_1 + \cdots + c_n(\lambda_n)^k x_n. \qquad (10)$$

In matrix language $A^k u_0$ equals $(V\Lambda V^{-1})^k u_0$. The 3 steps are $V$ times $\Lambda^k$ times $V^{-1}u_0$.

I am taking time with the three steps to compute $A^k u_0$, because you will see exactly the same steps for differential equations and $e^{At}$. The equation will be $dy/dt = Ay$. Please compare equation (10) for $A^k u_0$ with this solution $e^{At}y(0)$ from Section 6.3.

$$\text{Solve } dy/dt = Ay \qquad y(t) = e^{At}y(0) = c_1 e^{\lambda_1 t}x_1 + \cdots + c_n e^{\lambda_n t}x_n. \qquad (11)$$


Those parallel equations (10) and (11) show the point of eigenvalues and eigenvectors. 
They split the solutions into n simple pieces. By following each eigenvector separately—this 
is the result of diagonalizing the matrix—we have n scalar equations. 
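The growth factor $\lambda^k$ in (10) is like $e^{\lambda t}$ in (11).

Summary I will display the matrices in those steps. Here is $u_0 = Vc$:

$$\text{Step 1} \qquad u_0 = \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix} \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix}. \quad\text{This says that}\quad u_0 = c_1x_1 + \cdots + c_nx_n. \qquad (12)$$

The coefficients in Step 1 are $c = V^{-1}u_0$. Then Step 2 multiplies by $\Lambda^k$. Then Step 3 adds up all the $c_i(\lambda_i)^k x_i$ to get the product of $V$ and $\Lambda^k$ and $V^{-1}u_0$:

$$A^k u_0 = V\Lambda^k V^{-1}u_0 = \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix} \begin{bmatrix} (\lambda_1)^k & & \\ & \ddots & \\ & & (\lambda_n)^k \end{bmatrix} \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix}. \qquad (13)$$

This result is exactly $u_k = c_1(\lambda_1)^k x_1 + \cdots + c_n(\lambda_n)^k x_n$. It solves $u_{k+1} = Au_k$.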


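Example 3 Start from $u_0 = (1, 0)$. Compute $A^k u_0$ when $V$ and $\Lambda$ contain these eigenvectors and eigenvalues:

$$A = \begin{bmatrix} 1 & 2 \\ 1 & 0 \end{bmatrix} \quad\text{has}\quad \lambda_1 = 2 \text{ and } x_1 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}, \qquad \lambda_2 = -1 \text{ and } x_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}.$$

This matrix $A$ is like Fibonacci except the rule is changed to $F_{k+2} = F_{k+1} + 2F_k$.

The new numbers $0, 1, 1, 3, \ldots$ grow faster because $\lambda = 2$ is larger than $(1 + \sqrt{5})/2$.

Example 3 in three steps Find $u_0 = c_1x_1 + c_2x_2$ and $u_k = c_1(\lambda_1)^k x_1 + c_2(\lambda_2)^k x_2$

Step 1 $u_0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \frac{1}{3}\begin{bmatrix} 2 \\ 1 \end{bmatrix} + \frac{1}{3}\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ so $c_1 = c_2 = \frac{1}{3}$

Step 2 Multiply the two eigenvectors by $(\lambda_1)^k = 2^k$ and $(\lambda_2)^k = (-1)^k$

Step 3 Combine the pieces into $u_k = \frac{1}{3}\,2^k \begin{bmatrix} 2 \\ 1 \end{bmatrix} + \frac{1}{3}(-1)^k \begin{bmatrix} 1 \\ -1 \end{bmatrix}$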
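
The three steps of Example 3 can be compared with direct iteration. This is a minimal Python sketch that takes $A = [1\ 2;\ 1\ 0]$, eigenvectors $(2, 1)$ and $(1, -1)$, and $c_1 = c_2 = \frac{1}{3}$ from the example; the function names are made up for the illustration, and exact fractions avoid rounding.

```python
from fractions import Fraction

def u_by_steps(k):
    """Apply u_{k+1} = A u_k with A = [1 2; 1 0], k times from u_0 = (1, 0)."""
    u = (1, 0)
    for _ in range(k):
        u = (u[0] + 2 * u[1], u[0])
    return u

def u_by_eigenvectors(k):
    """c_1 (lambda_1)^k x_1 + c_2 (lambda_2)^k x_2 with c_1 = c_2 = 1/3."""
    c = Fraction(1, 3)
    a = c * 2 ** k        # coefficient of x_1 = (2, 1)
    b = c * (-1) ** k     # coefficient of x_2 = (1, -1)
    return (2 * a + b, a - b)

for k in range(12):
    assert u_by_steps(k) == u_by_eigenvectors(k)
print([u_by_steps(k)[1] for k in range(6)])   # [0, 1, 1, 3, 5, 11]
```

The second components reproduce the new numbers $0, 1, 1, 3, \ldots$ from the changed rule.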


Behind these examples lies the fundamental idea: Follow each eigenvector. 


Nondiagonalizable Matrices (Optional) 


Suppose $\lambda$ is an eigenvalue of $A$. We discover that fact in two ways:

1. Eigenvectors (geometric) There are nonzero solutions to $Ax = \lambda x$.

2. Eigenvalues (algebraic) The determinant of $A - \lambda I$ is zero.

The number $\lambda$ may be a simple eigenvalue or a multiple eigenvalue, and we want to know its multiplicity. Most eigenvalues have multiplicity $M = 1$ (simple eigenvalues). Then there is a single line of eigenvectors, and $\det(A - \lambda I)$ does not have a double factor.

For exceptional matrices, an eigenvalue can be repeated. Then there are two different ways to count its multiplicity. Always GM $\le$ AM for each eigenvalue.

1. (Geometric Multiplicity = GM) Count the independent eigenvectors for $\lambda$. This is the dimension of the nullspace of $A - \lambda I$.

2. (Algebraic Multiplicity = AM) Count the repetitions of the same $\lambda$ among the eigenvalues. Look at the $n$ roots of $\det(A - \lambda I) = 0$.

If $A$ has $\lambda = 4, 4, 4$, that eigenvalue has AM = 3 (triple root) and GM = 1 or 2 or 3. The following matrix $A$ is the standard example of trouble. Its eigenvalue $\lambda = 0$ is repeated. It is a double eigenvalue (AM = 2) with only one eigenvector (GM = 1).


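AM = 2, GM = 1:

$$A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \quad\text{has}\quad \det(A - \lambda I) = \begin{vmatrix} -\lambda & 1 \\ 0 & -\lambda \end{vmatrix} = \lambda^2. \qquad \lambda = 0, 0 \text{ but 1 eigenvector}$$

There “should” be two eigenvectors, because $\lambda^2 = 0$ has a double root. The double factor $\lambda^2$ makes AM = 2. But there is only one eigenvector $x = (1, 0)$. This shortage of eigenvectors when GM is below AM means that $A$ is not diagonalizable.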


These three matrices have $\lambda = 5, 5$. Traces are 10, determinants are 25. They only have one eigenvector:

$$A = \begin{bmatrix} 5 & 1 \\ 0 & 5 \end{bmatrix} \quad\text{and}\quad A = \begin{bmatrix} 6 & -1 \\ 1 & 4 \end{bmatrix} \quad\text{and}\quad A = \begin{bmatrix} 7 & 2 \\ -2 & 3 \end{bmatrix}$$

Those all have $\det(A - \lambda I) = (\lambda - 5)^2$. The algebraic multiplicity is AM = 2. But each $A - 5I$ has rank $r = 1$. The geometric multiplicity is GM = 1. There is only one line of eigenvectors for $\lambda = 5$, and these matrices are not diagonalizable.
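
For 2 by 2 matrices the geometric multiplicity is quick to compute, since GM $= 2 - \text{rank}(A - \lambda I)$. The Python sketch below is one way to do that count. The function name and the sample matrices (two with trace 10 and determinant 25, plus $5I$ for contrast) are chosen here for illustration.

```python
def geometric_multiplicity_2x2(A, lam):
    """Dimension of the nullspace of A - lam*I for a 2 by 2 matrix A."""
    a, b = A[0][0] - lam, A[0][1]
    c, d = A[1][0], A[1][1] - lam
    if a == b == c == d == 0:
        rank = 0                 # A - lam*I is the zero matrix
    elif a * d - b * c == 0:
        rank = 1                 # singular but not zero
    else:
        rank = 2                 # invertible, so lam is not an eigenvalue
    return 2 - rank

print(geometric_multiplicity_2x2([[5, 1], [0, 5]], 5))    # 1: not diagonalizable
print(geometric_multiplicity_2x2([[6, -1], [1, 4]], 5))   # 1: not diagonalizable
print(geometric_multiplicity_2x2([[5, 0], [0, 5]], 5))    # 2: already diagonal
```

The matrix $5I$ has the same repeated eigenvalue but a full plane of eigenvectors, so repetition alone does not rule out diagonalization.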


REVIEW OF THE KEY IDEAS


1. If $A$ has $n$ independent eigenvectors $x_1, \ldots, x_n$, they go into the columns of $V$.

$A$ is diagonalized by $V$: $V^{-1}AV = \Lambda$ and $A = V\Lambda V^{-1}$.

2. The powers of $A$ are $A^k = V\Lambda^k V^{-1}$. The eigenvectors in $V$ are unchanged.

3. The eigenvalues of $A^k$ are $(\lambda_1)^k, \ldots, (\lambda_n)^k$ in the matrix $\Lambda^k$.

4. The solution to $u_{k+1} = Au_k$ starting from $u_0$ is $u_k = A^k u_0 = V\Lambda^k V^{-1}u_0$:

$$u_k = c_1(\lambda_1)^k x_1 + \cdots + c_n(\lambda_n)^k x_n \quad\text{provided}\quad u_0 = c_1x_1 + \cdots + c_nx_n.$$

That shows Steps 1, 2, 3 ($c$’s from $V^{-1}u_0$, powers $\lambda^k$ from $\Lambda^k$, and $x$’s from $V$).


WORKED EXAMPLES


6.2 A Find the inverse and the eigenvalues and the determinant of $A$:

$$A = 5 * \text{eye}(4) - \text{ones}(4) = \begin{bmatrix} 4 & -1 & -1 & -1 \\ -1 & 4 & -1 & -1 \\ -1 & -1 & 4 & -1 \\ -1 & -1 & -1 & 4 \end{bmatrix}$$

Describe an eigenvector matrix $V$ that gives $V^{-1}AV = \Lambda$.

Solution What are the eigenvalues of the all-ones matrix ones(4)? Its rank is certainly 1, so three eigenvalues are $\lambda = 0, 0, 0$. Its trace is 4, so the other eigenvalue is $\lambda = 4$. Subtract the all-ones matrix from $5I$ to get our matrix $A = 5I - \text{ones}(4)$:

Subtract the eigenvalues 4, 0, 0, 0 from 5, 5, 5, 5. The eigenvalues of $A$ are 1, 5, 5, 5.

The $\lambda$’s add to 16. So does $4 + 4 + 4 + 4$ from diag($A$). Multiply $\lambda$’s: $\det A = 125$.

The eigenvector for $\lambda = 1$ is $x = (1, 1, 1, 1)$. The other eigenvectors are perpendicular to $x$ (since $A$ is symmetric). The nicest eigenvector matrix $V$ is the symmetric orthogonal Hadamard matrix. Multiply by $1/2$ to have unit vectors in its columns.


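$$\text{Orthonormal eigenvectors} \qquad V = Q = \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix} = Q^{\mathrm{T}} = Q^{-1}.$$

The eigenvalues of $A^{-1}$ are $1, \frac{1}{5}, \frac{1}{5}, \frac{1}{5}$. The eigenvectors are the same as for $A$. This inverse matrix $A^{-1} = Q\Lambda^{-1}Q^{-1}$ is surprisingly neat:

$$A^{-1} = \frac{1}{5} * (\text{eye}(4) + \text{ones}(4)) = \frac{1}{5}\begin{bmatrix} 2 & 1 & 1 & 1 \\ 1 & 2 & 1 & 1 \\ 1 & 1 & 2 & 1 \\ 1 & 1 & 1 & 2 \end{bmatrix}$$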


To check that AA~! = J, use (ones) (ones) = 4(ones). Question: Can you find A? ? 
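
The claimed inverse and eigenvectors of $5I - \text{ones}(4)$ can be confirmed with exact arithmetic. This is a small Python sketch under that reading of the example; the helper name `times` is an illustrative choice.

```python
from fractions import Fraction

n = 4
# A = 5*eye(4) - ones(4) and the claimed inverse (eye(4) + ones(4)) / 5.
A = [[5 * (i == j) - 1 for j in range(n)] for i in range(n)]
A_inv = [[Fraction((i == j) + 1, 5) for j in range(n)] for i in range(n)]

product = [[sum(A[i][m] * A_inv[m][j] for m in range(n)) for j in range(n)]
           for i in range(n)]
identity = [[int(i == j) for j in range(n)] for i in range(n)]
assert product == identity

def times(M, x):
    """Matrix times vector."""
    return [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]

# Columns of the Hadamard matrix: eigenvalue 1 for (1,1,1,1), eigenvalue 5 for the rest.
assert times(A, [1, 1, 1, 1]) == [1, 1, 1, 1]
for x in ([1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]):
    assert times(A, x) == [5 * xi for xi in x]
print("inverse and eigenvectors check out")
```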
Problem Set 6.2


Questions 1-7 are about the eigenvalue and eigenvector matrices $\Lambda$ and $V$.


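1 (a) Factor these two matrices into $A = V\Lambda V^{-1}$:

$$A = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} \quad\text{and}\quad A = \begin{bmatrix} 1 & 1 \\ 3 & 3 \end{bmatrix}$$

(b) If $A = V\Lambda V^{-1}$ then $A^3 = (\ \ )(\ \ )(\ \ )$ and $A^{-1} = (\ \ )(\ \ )(\ \ )$.


2 If $A$ has $\lambda_1 = 2$ with eigenvector $x_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\lambda_2 = 5$ with $x_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$, use $V\Lambda V^{-1}$ to find $A$. No other matrix has the same $\lambda$’s and $x$’s.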


3 Suppose $A = V\Lambda V^{-1}$. What is the eigenvalue matrix for $A + 2I$? What is the eigenvector matrix? Check that $A + 2I = (\ \ )(\ \ )(\ \ )^{-1}$.


4 True or false: If the columns of $V$ (eigenvectors of $A$) are linearly independent, then
(a) $A$ is invertible (b) $A$ is diagonalizable
(c) $V$ is invertible (d) $V$ is diagonalizable.


5 If the eigenvectors of $A$ are the columns of $I$, then $A$ is a ____ matrix. If the eigenvector matrix $V$ is triangular, then $V^{-1}$ is triangular. Prove that $A$ is also triangular.


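6 Describe all matrices $V$ that diagonalize this matrix $A$ (find all eigenvectors):

$$A = \begin{bmatrix} 4 & 0 \\ 1 & 2 \end{bmatrix}$$

Then describe all matrices that diagonalize $A^{-1}$.

7 Write down the most general matrix that has eigenvectors $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$.
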
Questions 8-10 are about Fibonacci and Gibonacci numbers. 
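8 Diagonalize the Fibonacci matrix by completing $V^{-1}$:

$$\begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} \lambda_1 & \lambda_2 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \begin{bmatrix} \ & \ \\ \ & \ \end{bmatrix}.$$

Do the multiplication $V\Lambda^k V^{-1}\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ to find its second component. This is the $k$th Fibonacci number $F_k = (\lambda_1^k - \lambda_2^k)/(\lambda_1 - \lambda_2)$.


9 Suppose $G_{k+2}$ is the average of the two previous numbers $G_{k+1}$ and $G_k$:

$$\begin{matrix} G_{k+2} = \frac{1}{2}G_{k+1} + \frac{1}{2}G_k \\ G_{k+1} = G_{k+1} \end{matrix} \quad\text{is}\quad \begin{bmatrix} G_{k+2} \\ G_{k+1} \end{bmatrix} = \begin{bmatrix} \ A\ \end{bmatrix} \begin{bmatrix} G_{k+1} \\ G_k \end{bmatrix}.$$

(a) Find $A$ and its eigenvalues and eigenvectors.
(b) Find the limit as $n \to \infty$ of the matrices $A^n = V\Lambda^n V^{-1}$.
(c) If $G_0 = 0$ and $G_1 = 1$ show that the Gibonacci numbers approach $\frac{2}{3}$.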
10 Prove that every third Fibonacci number in $0, 1, 1, 2, 3, \ldots$ is even.

Questions 11-14 are about diagonalizability.

11 True or false: If the eigenvalues of $A$ are 2, 2, 5 then the matrix is certainly
(a) invertible (b) diagonalizable (c) not diagonalizable.

12 True or false: If the only eigenvectors of $A$ are multiples of $(1, 4)$ then $A$ has
(a) no inverse (b) a repeated eigenvalue (c) no diagonalization $V\Lambda V^{-1}$.
13 Complete these matrices so that $\det A = 25$. Then check that $\lambda = 5$ is repeated: the trace is 10 so the determinant of $A - \lambda I$ is $(\lambda - 5)^2$. Find an eigenvector with $Ax = 5x$. These matrices will not be diagonalizable because there is no second line of eigenvectors.

$$A = \begin{bmatrix} 8 & \ \\ \ & 2 \end{bmatrix} \quad\text{and}\quad A = \begin{bmatrix} 9 & 4 \\ \ & 1 \end{bmatrix} \quad\text{and}\quad A = \begin{bmatrix} 10 & 5 \\ -5 & \ \end{bmatrix}$$


14 The matrix $A = \begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix}$ is not diagonalizable because the rank of $A - 3I$ is ____. Change one entry to make $A$ diagonalizable. Which entries could you change?


Questions 15-19 are about powers of matrices. 


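15 $A^k = V\Lambda^k V^{-1}$ approaches the zero matrix as $k \to \infty$ if and only if every $\lambda$ has absolute value less than ____. Which of these matrices has $A^k \to 0$?

$$A_1 = \begin{bmatrix} .6 & .9 \\ .4 & .1 \end{bmatrix} \quad\text{and}\quad A_2 = \begin{bmatrix} .6 & .9 \\ .1 & .6 \end{bmatrix}$$


16 (Recommended) Find $\Lambda$ and $V$ to diagonalize $A_1$ in Problem 15. What is the limit of $\Lambda^k$ as $k \to \infty$? What is the limit of $V\Lambda^k V^{-1}$? In the columns of this limiting matrix you see the ____.


17 Find $\Lambda$ and $V$ to diagonalize $A_2$ in Problem 15. What is $(A_2)^{10}u_0$ for these $u_0$?

$$u_0 = \begin{bmatrix} 3 \\ 1 \end{bmatrix} \quad\text{and}\quad u_0 = \begin{bmatrix} 3 \\ -1 \end{bmatrix} \quad\text{and}\quad u_0 = \begin{bmatrix} 6 \\ 0 \end{bmatrix}.$$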


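18 Diagonalize $A$ and compute $V\Lambda^k V^{-1}$ to prove this formula for $A^k$:

$$A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} \quad\text{has}\quad A^k = \frac{1}{2}\begin{bmatrix} 1 + 3^k & 1 - 3^k \\ 1 - 3^k & 1 + 3^k \end{bmatrix}.$$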
19 Diagonalize $B$ and compute $V\Lambda^k V^{-1}$ to prove this formula for $B^k$:

$$B = \begin{bmatrix} 5 & 1 \\ 0 & 4 \end{bmatrix} \quad\text{has}\quad B^k = \begin{bmatrix} 5^k & 5^k - 4^k \\ 0 & 4^k \end{bmatrix}.$$


20 Suppose $A = V\Lambda V^{-1}$. Take determinants to prove $\det A = \det \Lambda = \lambda_1\lambda_2 \cdots \lambda_n$. This quick proof only works when $A$ can be ____.


21 Show that trace $VT$ = trace $TV$, by adding the diagonal entries of $VT$ and $TV$:

$$V = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad\text{and}\quad T = \begin{bmatrix} q & r \\ s & t \end{bmatrix}.$$

Choose $T$ as $\Lambda V^{-1}$. Then $V\Lambda V^{-1}$ has the same trace as $\Lambda V^{-1}V = \Lambda$. The trace of $A$ equals the trace of $\Lambda$, which is certainly the sum of the eigenvalues.


22 $AB - BA = I$ is impossible since the left side has trace = ____. But find an elimination matrix so that $A = E$ and $B = E^{\mathrm{T}}$ give

$$AB - BA = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$$

which has trace zero.


23 If $A = V\Lambda V^{-1}$, diagonalize the block matrix $B = \begin{bmatrix} A & 0 \\ 0 & 2A \end{bmatrix}$. Find its eigenvalue and eigenvector (block) matrices.


24 Consider all 4 by 4 matrices $A$ that are diagonalized by the same fixed eigenvector matrix $V$. Show that the $A$’s form a subspace ($cA$ and $A_1 + A_2$ have this same $V$). What is this subspace when $V = I$? What is its dimension?


25 Suppose $A^2 = A$. On the left side $A$ multiplies each column of $A$. Which of our four subspaces contains eigenvectors with $\lambda = 1$? Which subspace contains eigenvectors with $\lambda = 0$? From the dimensions of those subspaces, $A$ has a full set of independent eigenvectors. So every matrix with $A^2 = A$ can be diagonalized.


26 (Recommended) Suppose $Ax = \lambda x$. If $\lambda = 0$ then $x$ is in the nullspace. If $\lambda \neq 0$ then $x$ is in the column space. Those spaces have dimensions $(n - r) + r = n$. So why doesn’t every square matrix have $n$ linearly independent eigenvectors?


27 The eigenvalues of $A$ are 1 and 9, and the eigenvalues of $B$ are $-1$ and 9:

$$A = \begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 4 & 5 \\ 5 & 4 \end{bmatrix}.$$

Find a matrix square root of $A$ from $R = V\sqrt{\Lambda}\,V^{-1}$. Why is there no real matrix square root of $B$?


28 The powers $A^k$ approach zero if all $|\lambda_i| < 1$ and they blow up if any $|\lambda_i| > 1$. Peter Lax gives these striking examples in his book Linear Algebra:

$$A = \begin{bmatrix} 3 & 2 \\ 1 & 4 \end{bmatrix} \quad B = \begin{bmatrix} 3 & 2 \\ -5 & -3 \end{bmatrix} \quad C = \begin{bmatrix} 5 & 7 \\ -3 & -4 \end{bmatrix} \quad D = \begin{bmatrix} 5 & 6.9 \\ -3 & -4 \end{bmatrix}$$

$$\|A^{1024}\| > 10^{700} \qquad B^{1024} = I \qquad C^{1024} = -C \qquad \|D^{1024}\| < 10^{-78}$$

Find the eigenvalues $\lambda = e^{i\theta}$ of $B$ and $C$ to show $B^4 = I$ and $C^3 = -I$.


29 If $A$ and $B$ have the same $\lambda$’s with the same full set of independent eigenvectors, their factorizations into ____ are the same. So $A = B$.


30 Suppose the same $V$ diagonalizes both $A$ and $B$. They have the same eigenvectors in $A = V\Lambda_1 V^{-1}$ and $B = V\Lambda_2 V^{-1}$. Prove that $AB = BA$.


31 (a) If $A = \begin{bmatrix} a & b \\ 0 & d \end{bmatrix}$ then the determinant of $A - \lambda I$ is $(\lambda - a)(\lambda - d)$. Check the “Cayley-Hamilton Theorem” that $(A - aI)(A - dI)$ = zero matrix.

(b) Test the Cayley-Hamilton Theorem on Fibonacci’s $A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$. The theorem predicts that $A^2 - A - I = 0$, since the polynomial $\det(A - \lambda I)$ is $\lambda^2 - \lambda - 1$.


32 Substitute $A = V\Lambda V^{-1}$ into the product $(A - \lambda_1 I)(A - \lambda_2 I) \cdots (A - \lambda_n I)$ and explain why this produces the zero matrix. We are substituting the matrix $A$ for the number $\lambda$ in the polynomial $p(\lambda) = \det(A - \lambda I)$. The Cayley-Hamilton Theorem says that this product is always $p(A)$ = zero matrix, even if $A$ is not diagonalizable.


Challenge Problems

33 The $n$th power of rotation through $\theta$ is rotation through $n\theta$:

$$A^n = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}^n = \begin{bmatrix} \cos n\theta & -\sin n\theta \\ \sin n\theta & \cos n\theta \end{bmatrix}.$$

Prove that neat formula by diagonalizing $A = V\Lambda V^{-1}$. The eigenvectors (columns of $V$) are $(1, i)$ and $(i, 1)$. You need to know Euler’s formula $e^{i\theta} = \cos\theta + i\sin\theta$.


34 The transpose of $A = V\Lambda V^{-1}$ is $A^{\mathrm{T}} = (V^{-1})^{\mathrm{T}}\Lambda V^{\mathrm{T}}$. The eigenvectors in $A^{\mathrm{T}}y = \lambda y$ are the columns of that matrix $(V^{-1})^{\mathrm{T}}$. They are often called left eigenvectors.

How do you multiply three matrices $V\Lambda V^{-1}$ to find this formula for $A$?

$$\text{Sum of rank-1 matrices} \qquad A = V\Lambda V^{-1} = \lambda_1 x_1 y_1^{\mathrm{T}} + \cdots + \lambda_n x_n y_n^{\mathrm{T}}.$$


35 The inverse of $A$ = eye($n$) + ones($n$) is $A^{-1}$ = eye($n$) + $C$ * ones($n$). Multiply $AA^{-1}$ to find that number $C$ (depending on $n$).
6.3 Linear Systems y’ = Ay


This section is about first order systems of linear differential equations. The key words are systems and linear. A system allows $n$ equations for $n$ unknown functions $y_1(t), \ldots, y_n(t)$. A linear system multiplies that unknown vector $y(t)$ by a matrix $A$. Then a first order linear system can include a source term $q(t)$, or not:

$$\text{Without source} \quad \frac{dy}{dt} = Ay(t) \qquad\qquad \text{With source} \quad \frac{dy}{dt} = Ay(t) + q(t)$$

Without a source term, the only input is $y(0)$ at the start. With $q(t)$ included, there is also a continuing input $q(t)\,dt$ between times $t$ and $t + dt$. Forward from time $t$, this input grows or decays along with the $y(t)$ that just arrived from the past. That is important.

The transient solution $y_n(t)$ starts from $y(0)$, when $q(t) = 0$. The output coming from the source $q(t)$ is one particular solution $y_p(t)$. Linearity allows superposition! The complete solution with source included is $y(t) = y_n(t) + y_p(t)$ as always.

The serious work of this section is to find $y_n(t)$, the null solution to $\frac{dy_n}{dt} - Ay_n = 0$. Then Section 6.4 accounts for the source term $q(t)$ and finds a particular solution.

We want to use the eigenvalues and eigenvectors of $A$. We don’t want those to change with time. So we kept our equation linear time-invariant, with a constant matrix $A$. Fortunately, many important systems have $A$ = constant in the first place. The system is not changing, it is only the state of the system that changes: constant $A$, evolving state $y(t)$.

We will express $y(t)$ as a combination of eigenvectors of $A$. Section 6.4 uses $e^{At}$.


Solution by Eigenvectors and Eigenvalues 


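Suppose the $n$ by $n$ matrix $A$ has $n$ independent eigenvectors. This is automatic if $A$ has $n$ different eigenvalues $\lambda$. Then the eigenvectors $x_1, \ldots, x_n$ are a basis in which we can express any starting vector $y(0)$:

$$\text{Initial condition} \qquad y(0) = c_1x_1 + \cdots + c_nx_n \quad\text{for some numbers } c_1, \ldots, c_n. \qquad (1)$$

Computing the $c$’s is Step 1 in the solution, after finding the $\lambda$’s and $x$’s.
Step 2 solves the equation $y' = Ay$ using $y = e^{\lambda t}x$. Start from any eigenvector:

$$\text{If } Ax = \lambda x \text{ then } y(t) = e^{\lambda t}x \text{ solves } \frac{dy}{dt} = Ay. \qquad (2)$$

This solution $y = e^{\lambda t}x$ separates the time-dependent $e^{\lambda t}$ from the constant vector $x$:

$$\frac{dy}{dt} = Ay \quad\text{becomes}\quad \frac{d}{dt}\left(e^{\lambda t}x\right) = \lambda e^{\lambda t}x = A\left(e^{\lambda t}x\right). \qquad (3)$$

Step 3 is the final solution step. Add the $n$ separate solutions from the $n$ eigenvectors.

$$\text{Superposition} \qquad y(t) = c_1 e^{\lambda_1 t}x_1 + \cdots + c_n e^{\lambda_n t}x_n. \qquad (4)$$

At $t = 0$ this matches $y(0)$ in equation (1). That was Step 1, where we chose the $c$’s.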
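Example 1 Find all solutions to $y' = \begin{bmatrix} -2 & 1 \\ 1 & -2 \end{bmatrix} y$. Which solution has $y(0) = \begin{bmatrix} 6 \\ 2 \end{bmatrix}$?


Solution First we find $\lambda = -1$ and $-3$. Their eigenvectors $x_1$ and $x_2$ go into $V$:

$$\det \begin{bmatrix} -2 - \lambda & 1 \\ 1 & -2 - \lambda \end{bmatrix} = \lambda^2 + 4\lambda + 3 \quad\text{factors into}\quad (\lambda + 1)(\lambda + 3)$$

$$(A + I)x_1 = \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \qquad (A + 3I)x_2 = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

Step 1 Solve $y(0) = Vc$. Then $y(0)$ is a mixture $4x_1 + 2x_2$ of the eigenvectors:

$$Vc = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 6 \\ 2 \end{bmatrix} \quad\text{gives}\quad \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 4 \\ 2 \end{bmatrix}. \quad\text{Then}\quad \begin{bmatrix} 6 \\ 2 \end{bmatrix} = 4\begin{bmatrix} 1 \\ 1 \end{bmatrix} + 2\begin{bmatrix} 1 \\ -1 \end{bmatrix}.$$

Step 2 finds the separate solutions $ce^{\lambda t}x$ given by $4e^{-t}x_1$ and $2e^{-3t}x_2$. Now add:

$$\text{Step 3} \qquad y(t) = 4e^{-t}\begin{bmatrix} 1 \\ 1 \end{bmatrix} + 2e^{-3t}\begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 4e^{-t} + 2e^{-3t} \\ 4e^{-t} - 2e^{-3t} \end{bmatrix}. \qquad (5)$$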


For a larger matrix the computations are harder. The idea doesn’t change. 
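
A quick numerical check of Example 1 is to differentiate the solution by hand and compare with $Ay$. The Python sketch below assumes the matrix is $A = [-2\ 1;\ 1\ -2]$, which is the matrix with eigenvalues $-1, -3$ and eigenvectors $(1, 1)$, $(1, -1)$; the function names are illustrative.

```python
import math

A = [[-2.0, 1.0], [1.0, -2.0]]   # assumed matrix for Example 1

def y(t):
    """y(t) = 4 e^{-t} (1, 1) + 2 e^{-3t} (1, -1)."""
    return (4 * math.exp(-t) + 2 * math.exp(-3 * t),
            4 * math.exp(-t) - 2 * math.exp(-3 * t))

def y_prime(t):
    """Derivative of y(t), term by term."""
    return (-4 * math.exp(-t) - 6 * math.exp(-3 * t),
            -4 * math.exp(-t) + 6 * math.exp(-3 * t))

def A_times(v):
    """Matrix A times a 2-vector."""
    return (A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1])

for t in (0.0, 0.5, 1.0, 2.0):
    lhs, rhs = y_prime(t), A_times(y(t))
    assert abs(lhs[0] - rhs[0]) < 1e-12 and abs(lhs[1] - rhs[1]) < 1e-12
print(y(0.0))   # (6.0, 2.0)
```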


Now I want to show a matrix with complex eigenvalues and eigenvectors. This will lead us to complex numbers in $y(t)$. But $A$ is real and $y(0)$ is real, so $y(t)$ must be real! Euler’s formula $e^{it} = \cos t + i\sin t$ will get us back to real numbers.


Example 2 Find all solutions to $y' = \begin{bmatrix} -2 & 1 \\ -1 & -2 \end{bmatrix} y$. Which solution has $y(0) = \begin{bmatrix} 6 \\ 2 \end{bmatrix}$?


Solution Again we find the eigenvalues and eigenvectors, now complex:

$$\det(A - \lambda I) = 0 \qquad \det \begin{bmatrix} -2 - \lambda & 1 \\ -1 & -2 - \lambda \end{bmatrix} = \lambda^2 + 4\lambda + 5 \quad\text{(no real factors)}$$

We use the quadratic formula to solve $\lambda^2 + 4\lambda + 5 = 0$. The eigenvalues are $\lambda = -2 \pm i$. The eigenvectors are $x = (1, \pm i)$.
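To solve $y' = Ay$, Step 1 expresses $y(0) = (6, 2)$ as a combination of those eigenvectors:

$$y(0) = Vc = c_1x_1 + c_2x_2 \qquad \begin{bmatrix} 6 \\ 2 \end{bmatrix} = (3 - i)\begin{bmatrix} 1 \\ i \end{bmatrix} + (3 + i)\begin{bmatrix} 1 \\ -i \end{bmatrix}.$$

Step 2 finds the solutions $c_1e^{\lambda_1 t}x_1$ and $c_2e^{\lambda_2 t}x_2$. Step 3 combines them into $y(t)$:

$$\text{Solution} \qquad y(t) = c_1e^{\lambda_1 t}x_1 + c_2e^{\lambda_2 t}x_2 = (3 - i)e^{(-2 + i)t}\begin{bmatrix} 1 \\ i \end{bmatrix} + (3 + i)e^{(-2 - i)t}\begin{bmatrix} 1 \\ -i \end{bmatrix}.$$

As expected, this looks complex. As promised, it must be real. Factoring out $e^{-2t}$ leaves

$$(3 - i)(\cos t + i\sin t)\begin{bmatrix} 1 \\ i \end{bmatrix} + (3 + i)(\cos t - i\sin t)\begin{bmatrix} 1 \\ -i \end{bmatrix} = \begin{bmatrix} 6\cos t + 2\sin t \\ 2\cos t - 6\sin t \end{bmatrix}. \qquad (6)$$

Put back the factor $e^{-2t}$ to find the (real) $y(t)$. It would be wise to check $y' = Ay$:

$$y(t) = e^{-2t}\begin{bmatrix} 6\cos t + 2\sin t \\ 2\cos t - 6\sin t \end{bmatrix}. \qquad (7)$$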


The factor $e^{-2t}$ from the real part of $\lambda$ means decay. The $\cos t$ and $\sin t$ factors from the imaginary part mean oscillation. The oscillation frequency in $\cos t = \cos\omega t$ is $\omega = 1$.
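
The passage from the complex combination to the real solution can be checked numerically. The Python sketch below evaluates both forms for Example 2, taking $\lambda = -2 \pm i$, $x = (1, \pm i)$ and $c = 3 \mp i$ as in the example; the function names are illustrative.

```python
import cmath
import math

def y_complex(t):
    """(3 - i) e^{(-2+i)t} (1, i) + (3 + i) e^{(-2-i)t} (1, -i)."""
    c1, c2 = complex(3, -1), complex(3, 1)
    e1 = cmath.exp(complex(-2, 1) * t)
    e2 = cmath.exp(complex(-2, -1) * t)
    return (c1 * e1 + c2 * e2, c1 * e1 * 1j - c2 * e2 * 1j)

def y_real(t):
    """e^{-2t} (6 cos t + 2 sin t, 2 cos t - 6 sin t)."""
    decay = math.exp(-2 * t)
    return (decay * (6 * math.cos(t) + 2 * math.sin(t)),
            decay * (2 * math.cos(t) - 6 * math.sin(t)))

for t in (0.0, 0.4, 1.0, 3.0):
    zc, yr = y_complex(t), y_real(t)
    for z, r in zip(zc, yr):
        assert abs(z.imag) < 1e-12      # imaginary parts cancel
        assert abs(z.real - r) < 1e-12  # real parts match the real formula
print(y_real(0.0))   # (6.0, 2.0)
```

The two complex terms are conjugates of each other, which is why their sum is real for every $t$.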


Note The $-2$’s on the diagonal of $A$ (which is exactly $-2I$) are responsible for the real parts $-2$ of the $\lambda$’s. They give the decay factor $e^{-2t}$. Without the $-2$’s we would only have sines and cosines, which converts into circular motion in the $y_1$-$y_2$ plane. That is a very important example to see by itself.


Example 3 Pure circular motion and pure imaginary eigenvalues

$$y' = \begin{bmatrix} y_1' \\ y_2' \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \quad\text{sends } y \text{ around a circle.}$$

Discussion The equations are $y_1' = y_2$ and $y_2' = -y_1$. One solution is $y_1 = \sin t$ and $y_2 = \cos t$. A second solution is $y_1 = \cos t$ and $y_2 = -\sin t$. We need two solutions to match two required values $y_1(0)$ and $y_2(0)$. Those solutions would come in the usual way from the eigenvalues $\lambda = \pm i$ and the eigenvectors.

Figure 6.2a shows the solution to Example 2 spiralling in to zero (because of $e^{-2t}$). Figure 6.2b shows the solution to Example 3 staying on the circle (because of sine and cosine). These are good examples to see the “phase plane” with axes $y_1$ and $y_1' = y_2$.

Without the $-2$’s, the matrix $A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ is a rotation by 90°. At every instant, $y'$ is at a 90° angle with $y$. That keeps $y$ moving in a circle. Its length is constant:

$$\text{Constant length, circular orbit} \qquad \frac{d}{dt}\left(y_1^2 + y_2^2\right) = 2y_1y_1' + 2y_2y_2' = 2y_1y_2 - 2y_2y_1 = 0. \qquad (8)$$
[Figure 6.2, two phase-plane panels with axes $y_1$ and $y_2$. (a) Spiral: $A = \begin{bmatrix} -2 & 1 \\ -1 & -2 \end{bmatrix}$, $y(0) = (6, 2)$, $y'(0) = (-10, -10)$, $y_1^2 + y_2^2 = 40e^{-4t}$. (b) Circle: $A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$, $y(0) = (6, 2)$, $y'(0) = (2, -6)$, $y_1^2 + y_2^2 = 40$.]

Figure 6.2: (a) The solution (7) including $e^{-2t}$. (b) The solution (6) without $e^{-2t}$.


Conservative Motion 


Travel around a circle is an example of conservative motion for n = 2. The length of y 
does not change. “Energy is conserved.” For n = 3 this would become travel on a sphere. 
For n > 3 the vector y would move with constant length around a hypersphere. 

Which linear differential equations produce this conservative motion? We are asking for the squared length $\|y\|^2 = y^{\mathrm{T}}y$ to stay constant. So its derivative is zero:

$$\frac{d}{dt}\left(y^{\mathrm{T}}y\right) = \left(\frac{dy}{dt}\right)^{\mathrm{T}}y + y^{\mathrm{T}}\frac{dy}{dt} = (Ay)^{\mathrm{T}}y + y^{\mathrm{T}}(Ay) = y^{\mathrm{T}}\left(A^{\mathrm{T}} + A\right)y = 0. \qquad (9)$$

The first step was the product rule. Then $dy/dt$ was replaced by $Ay$. Conclusion:

$$\|y\|^2 \text{ is constant when } A \text{ is antisymmetric:} \quad A^{\mathrm{T}} + A = 0 \text{ and } A^{\mathrm{T}} = -A. \qquad (10)$$

The simplest example is $A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$. Then $y$ goes around the circle in Figure 6.2b. The initial vector $y(0)$ decides the size of the circle: $\|y(t)\| = \|y(0)\|$ for all time.

When $A$ is antisymmetric, its eigenvalues are pure imaginary. This comes in Section 6.5.


Stable Motion 


Motion around a circle is only “neutral” stability. For a truly stable linear system, the solution $y(t)$ always goes to zero. It is the spiral in Figure 6.2a that shows stability:

$$A = \begin{bmatrix} -2 & 1 \\ -1 & -2 \end{bmatrix} \text{ has eigenvalues } \lambda = -2 \pm i. \text{ This } A \text{ is a stable matrix.}$$

The key is in the eigenvalues of $A$, which give the simple solutions $y = e^{\lambda t}x$. When $A$ is diagonalizable ($n$ independent eigenvectors), every solution is a combination of $e^{\lambda_1 t}x_1, \ldots, e^{\lambda_n t}x_n$. So we only have to ask when those simple solutions approach zero:

$$\text{Stability} \qquad e^{\lambda t}x \to 0 \text{ when the real part of } \lambda \text{ is negative: } \operatorname{Re}\lambda < 0.$$

The real parts $-2$ give the exponential decay factor $e^{-2t}$ in the solution $y$. That factor produces the inward spiral in Figure 6.2a and the stability of the equation $y' = Ay$. The imaginary parts of $\lambda = -2 \pm i$ give oscillations: sines and cosines that stay bounded.


Test for Stability When n = 2 


For a 2 by 2 matrix, the trace and determinant tell us both eigenvalues. So the trace and determinant must decide stability. A real matrix $A$ has two possibilities R and C:

R Real eigenvalues $\lambda_1$ and $\lambda_2$
C Complex conjugate pair $\lambda_1 = s + i\omega$ and $\lambda_2 = s - i\omega$

Adding the eigenvalues gives the trace of $A$. Multiplying the eigenvalues gives the determinant of $A$. We check the two possibilities R and C, to see when $\operatorname{Re}(\lambda) < 0$.

R If $\lambda_1 < 0$ and $\lambda_2 < 0$, then trace $= \lambda_1 + \lambda_2 < 0$ and determinant $= \lambda_1\lambda_2 > 0$
C If $s < 0$ in $\lambda = s \pm i\omega$, then trace $= 2s < 0$ and determinant $= s^2 + \omega^2 > 0$

Both cases give the same stability requirement: Negative trace and positive determinant.

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \text{ is stable exactly when } \begin{matrix} \text{trace} = a + d < 0 \\ \det = ad - bc > 0 \end{matrix} \qquad (11)$$

It was the quadratic formula that led us to the possibilities R and C, real or complex. Remember the equation $\det(A - \lambda I) = 0$ for the eigenvalues:

$$\det \begin{bmatrix} a - \lambda & b \\ c & d - \lambda \end{bmatrix} = \lambda^2 - (a + d)\lambda + (ad - bc) = \lambda^2 - (\text{trace})\lambda + (\det) = 0.$$

The quadratic formula for the two eigenvalues includes an all-important square root:

$$\text{Real or complex } \lambda \qquad \lambda = \frac{1}{2}\left[\text{trace} \pm \sqrt{(\text{trace})^2 - 4(\det)}\right]. \qquad (12)$$

The roots are real (case R) when $(\text{trace})^2 \ge 4(\det)$. The roots are complex (case C) when $(\text{trace})^2 < 4(\det)$. The line between R and C is the parabola in the stability picture:


(Trace)? = 4 (det) a Es is stable : ; is unstable 


Stable matrices only fill one quadrant of the trace-determinant plane: trace < 0, det > 0. 
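
The trace-determinant test is simple enough to code and compare with the eigenvalues themselves. The Python sketch below does both for a 2 by 2 matrix $[a\ b;\ c\ d]$; the function names and the sample matrices in the loop are illustrative choices.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Both roots of lambda^2 - (trace) lambda + (det) = 0, possibly complex."""
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

def is_stable(a, b, c, d):
    """Stable exactly when trace < 0 and determinant > 0."""
    return (a + d) < 0 and (a * d - b * c) > 0

samples = [(-2, 1, -1, -2),   # spiral in: eigenvalues -2 +/- i
           (0, 1, -1, 0),     # circle: eigenvalues +/- i, neutral
           (1, 2, 2, 1),      # determinant < 0: one positive eigenvalue
           (-3, 1, 1, -3)]    # both eigenvalues real and negative
for m in samples:
    by_eigenvalues = all(lam.real < 0 for lam in eigenvalues_2x2(*m))
    assert is_stable(*m) == by_eigenvalues
    print(m, is_stable(*m))
```

The neutral circle case has trace exactly zero, so it fails the strict inequality and is not counted as stable.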
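[Stability picture: the trace-determinant plane, with the parabola $(\text{trace})^2 = 4(\det)$ separating C (complex $\lambda$, above the parabola) from R (real $\lambda$, below it). For determinant $D > 0$ in region C: both $\operatorname{Re}\lambda < 0$ (stable) on the negative-trace side and both $\operatorname{Re}\lambda > 0$ (unstable) on the positive-trace side. In region R: both $\lambda < 0$ (stable) or both $\lambda > 0$ (unstable). Example matrices labeled stable, unstable, and neutral appear beside the picture.]

$\det < 0$ means $\lambda_1 < 0$ and $\lambda_2 > 0$: unstable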


Second Order Equation to First Order System 


Chapter 2 of this book studied the second order equation $y'' + By' + Cy = 0$. Often this is oscillation with underdamping. The solutions $y = e^{(a + i\omega)t}$ and $e^{(a - i\omega)t}$ come from the quadratic equation $s^2 + Bs + C = 0$, when we search for solutions $y = e^{st}$. If $B^2$ is larger than $4C$, then the roots are real and the solutions are $e^{s_1 t}$ and $e^{s_2 t}$. In that overdamped case, the oscillations are gone.

I want to show you exactly the same solutions in the language of $y' = Ay$. Instead of one equation with $y''$ we will reach two equations with $y' = (y_1', y_2')$. You have seen the key idea before: The original $y$ and $y'$ become $y_1$ and $y_2$. Then the matrix $A$ is a companion matrix.

$$y'' + By' + Cy = 0 \quad\text{becomes}\quad \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}' = \begin{bmatrix} y' \\ y'' \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -C & -B \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}. \qquad (13)$$

It is important to see why the roots $s_1$ and $s_2$ are also the eigenvalues $\lambda_1$ and $\lambda_2$. The reason is, these are still the roots of the same equation $s^2 + Bs + C = 0$. Only the letter $s$ is changed to $\lambda$.

$$\det(A - \lambda I) = \det \begin{bmatrix} -\lambda & 1 \\ -C & -B - \lambda \end{bmatrix} = \lambda^2 + B\lambda + C = 0. \qquad (14)$$

This was foreshadowed when we drew the six solution paths in Section 3.2: Sources, Sinks, Spirals, and Saddles. Those pictures were in the $y, y'$ plane (the phase plane). Now the same pictures are in the $y_1, y_2$ plane. I specially want to show you again the trace and determinant of $A$ and the whole new-old understanding of stability.

$$A = \begin{bmatrix} 0 & 1 \\ -C & -B \end{bmatrix} \text{ has trace} = -B \text{ and determinant} = C.$$

First the test for real roots of $s^2 + Bs + C = 0$ and for real eigenvalues of $A$:

R Real roots and real eigenvalues: $B^2 \ge 4C$, $(\text{trace})^2 \ge 4(\det)$
C Complex roots and eigenvalues $\lambda = a \pm i\omega$: $B^2 < 4C$, $(\text{trace})^2 < 4(\det)$

In the picture, the dashed parabola $T^2 = 4D$ separates real from complex: R from C.
More than that, the highlighted quadrant displays the three possibilities for damping. These are all stable: $B > 0$ and $C > 0$.

Underdamping: Complex roots, $B^2 < 4C$, above the parabola
Critical damping: Equal roots, $B^2 = 4C$, on the parabola
Overdamping: Real roots, $B^2 > 4C$, below the parabola

The undamped case $B = 0$ is on the vertical axis: eigenvalues $\pm i\omega$ with $\omega^2 = C$.

Everything comes together for 2 by 2 companion matrices. The eigenvectors are attractive too:

$$x_1 = \begin{bmatrix} 1 \\ \lambda_1 \end{bmatrix} \quad x_2 = \begin{bmatrix} 1 \\ \lambda_2 \end{bmatrix} \quad\text{agree with}\quad \begin{bmatrix} y \\ y' \end{bmatrix} = \begin{bmatrix} e^{\lambda t} \\ \lambda e^{\lambda t} \end{bmatrix} = \begin{bmatrix} 1 \\ \lambda \end{bmatrix} \text{ at } t = 0. \qquad (15)$$

The same method applies to systems with $n$ oscillators. $B$ and $C$ become matrices. The vectors $y$ and $y'$ have $n$ components and the joint vector $z = (y, y')$ has $2n$ components. The network leads to $n$ second order equations for $y$, or $2n$ first order equations for $z$:

$$y'' + By' + Cy = 0 \qquad z' = \begin{bmatrix} y' \\ y'' \end{bmatrix} = \begin{bmatrix} 0 & I \\ -C & -B \end{bmatrix} \begin{bmatrix} y \\ y' \end{bmatrix} = Az. \qquad (16)$$
Eigenvectors give the null solutions y,,. Real problems come with forcing terms q = Fe**. 
Here I make just one point about repeated roots and repeated eigenvalues: 
If $\lambda_1 = \lambda_2$ there is no second eigenvector of the companion matrix $A$. That matrix can’t be diagonalized and the eigenvector method fails. The next section will succeed with $e^{At}$, even without a full set of eigenvectors.


Higher Order Equations Give First Order Systems 


A third order (or higher order) equation reduces to first order in the same way. Introduce 
derivatives of y as new unknowns. This is easy to see for a single third order equation 
with constant coefficients : 


yl” + By” + Cy' + Dy =0 (17) 


The idea is to create a vector unknown z = (y,y’,y”). The first component y satisfies a 
very simple equation: its derivative is the second component y’. Then the matrix below 
has 0,1,0 in its first row. Similarly the derivative of y’ is y”. The second row of the 
companion matrix is 0,0, 1. The third row contains the original differential equation (17): 


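$$z' = Az \quad \text{is} \quad \begin{bmatrix} y \\ y' \\ y'' \end{bmatrix}' = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -D & -C & -B \end{bmatrix} \begin{bmatrix} y \\ y' \\ y'' \end{bmatrix} = Az. \qquad (18)$$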


Companion matrices have 1’s on their superdiagonal. We want to know their eigenvalues.

Eigenvalues of the Companion Matrix = Roots of the Polynomial

Start with the eigenvalues of the 2 by 2 companion matrix:

$$\det(A - \lambda I) = \det \begin{bmatrix} -\lambda & 1 \\ -C & -B - \lambda \end{bmatrix} = \lambda^2 + B\lambda + C = 0. \qquad (19)$$


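Compare that with substituting $y = e^{\lambda t}$ in the single equation y” + By’ + Cy = 0:

$$\lambda^2 e^{\lambda t} + B\lambda e^{\lambda t} + Ce^{\lambda t} = 0 \quad \text{gives} \quad \lambda^2 + B\lambda + C = 0. \qquad (20)$$

The equations are the same. The λ’s in special solutions $y = e^{\lambda t}$ are the same as the
eigenvalues in special solutions $z = e^{\lambda t}x$. This is our main point and it is true again for
3 by 3. The eigenvalue equation det(A - λI) = 0 is exactly the polynomial equation
from substituting $y = e^{\lambda t}$ in y''' + By'' + Cy' + Dy = 0:

$$\det \begin{bmatrix} -\lambda & 1 & 0 \\ 0 & -\lambda & 1 \\ -D & -C & -B - \lambda \end{bmatrix} = -(\lambda^3 + B\lambda^2 + C\lambda + D) = 0. \qquad (21)$$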


The eigenvectors of this companion matrix have the special form $x = (1, \lambda, \lambda^2)$.
Fourth order equations become z’ = Az with z = (y, y’, y”, y'''). The 4 by 4 companion matrix
has eigenvalues from $\lambda^4 + B\lambda^3 + C\lambda^2 + D\lambda + E = 0$.
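The match between eigenvalues and polynomial roots can be checked numerically. The sketch below is only an illustration: it assumes NumPy and uses the coefficients B = -2, C = -1, D = 2 (the equation y''' - 2y'' - y' + 2y = 0 from the problem set), and the variable names are chosen here for the example.

```python
import numpy as np

# Illustrative coefficients for y''' + B y'' + C y' + D y = 0.
# Here s^3 - 2s^2 - s + 2 = (s - 2)(s - 1)(s + 1).
B, C, D = -2.0, -1.0, 2.0

# 3 by 3 companion matrix: 1's on the superdiagonal, equation in the last row.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [-D, -C, -B]])

eigenvalues = np.sort(np.linalg.eigvals(A).real)
roots = np.sort(np.roots([1.0, B, C, D]).real)

# Each eigenvector should be a multiple of (1, lambda, lambda^2).
residuals = []
for lam in eigenvalues:
    x = np.array([1.0, lam, lam**2])
    residuals.append(np.linalg.norm(A @ x - lam * x))

print(eigenvalues, roots, max(residuals))
```

The residuals measure Ax - λx for x = (1, λ, λ²), so they should be zero up to roundoff.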


Example 4   $(\lambda - 2)^2 = \lambda^2 - 4\lambda + 4 = 0$ comes from y” - 4y’ + 4y = 0:

$$\text{Companion matrix} \quad A = \begin{bmatrix} 0 & 1 \\ -4 & 4 \end{bmatrix} \qquad \text{Repeated root } \lambda = 2, 2 \qquad \det(A - \lambda I) = \lambda^2 - 4\lambda + 4.$$

λ = 2 must have one eigenvector, and it is x = (1, 2). There is no second eigenvector.
The first order system z’ = Az and the second order equation y” - 4y’ + 4y = 0 are
in (the same) trouble. The only pure exponential solution is $y = e^{2t}$.

The way out for y is the solution $te^{2t}$. It needs that new form (including t).
The way out for z is a “generalized eigenvector” but we are not going there.
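A short check, written out here as a worked substitution, confirms that $te^{2t}$ solves the equation in Example 4:

```latex
\begin{aligned}
y   &= t e^{2t}, \qquad y' = e^{2t} + 2t e^{2t}, \qquad y'' = 4e^{2t} + 4t e^{2t}, \\
y'' - 4y' + 4y &= (4e^{2t} + 4te^{2t}) - 4(e^{2t} + 2te^{2t}) + 4te^{2t} \\
               &= (4 - 4)\,e^{2t} + (4 - 8 + 4)\,te^{2t} = 0 .
\end{aligned}
```

Both coefficients cancel exactly because λ = 2 is a double root of $\lambda^2 - 4\lambda + 4$.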


REVIEW OF THE KEY IDEAS

1. The system y’ = Ay is linear with constant coefficients, starting from y(0).
2. Its solution is usually a combination of exponentials $e^{\lambda t}$ times eigenvectors x:
   n independent eigenvectors   $y(t) = c_1 e^{\lambda_1 t}x_1 + \cdots + c_n e^{\lambda_n t}x_n$.
3. The constants $c_1, \ldots, c_n$ are determined by $y(0) = c_1x_1 + \cdots + c_nx_n$. This is y(0) = Vc.
4. y(t) approaches zero (stability) if every λ has negative real part: Re λ < 0.
5. 2 by 2 systems are stable if trace T = a + d < 0 and det D = ad - bc > 0.
6. y” + By’ + Cy = 0 leads to a companion matrix with trace = -B and det = C.
Problem Set 6.3


1 Find all solutions y = c,e*!'a, + cpe*2*ay to y! = | : : | y. Which solution 


starts from y(0) = c)x%1 + co@2 = (2,2)? 


2 Find two solutions of the form y = e**a to y’ = | : a 


3 If a £d, find the eigenvalues and eigenvectors and the complete solution to y’ = Ay. 
This equation is stable when a and d are 


py |lieare<O 
Fee NO aa: 
4 Ifa -—b, find the solutions e**a, and e*?'a toy! = Ay: 


A= | al Why is y’ = Ay not stable? 


5 Find the eigenvalues $\lambda_1, \lambda_2, \lambda_3$ and the eigenvectors $x_1, x_2, x_3$ of A. Write
y(0) = (0, 1, 0) as a combination $c_1x_1 + c_2x_2 + c_3x_3 = Vc$ and solve y’ = Ay.
What is the limit of y(t) as t → ∞ (the steady state)? Steady states come from λ = 0.

$$A = \begin{bmatrix} -1 & 1 & 0 \\ 1 & -2 & 1 \\ 0 & 1 & -1 \end{bmatrix}$$


6 The simplest 2 by 2 matrix without two independent eigenvectors has λ = 0, 0:

$$\begin{bmatrix} y_1' \\ y_2' \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \quad \text{has a first solution} \quad \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = e^{0t} \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$

Find a second solution to these equations $y_1' = y_2$ and $y_2' = 0$. That second solution
starts with t times the first solution to give $y_1 = t$. What is $y_2$?

Note A complete discussion of y’ = Ay for all cases of repeated λ’s would involve
the Jordan form of A: too technical. Section 6.4 shows that a triangular form is
sufficient, as Problems 6 and 8 confirm. We can solve for $y_2$ and then $y_1$.


7 Find two λ’s and x’s so that $y = e^{\lambda t}x$ solves

$$\frac{dy}{dt} = \begin{bmatrix} 4 & 3 \\ 0 & 1 \end{bmatrix} y.$$

What combination $y = c_1e^{\lambda_1 t}x_1 + c_2e^{\lambda_2 t}x_2$ starts from y(0) = (5, -2)?
8 Solve Problem 7 for y = (y, z) by back substitution, z before y:

Solve dz/dt = z from z(0) = -2. Then solve dy/dt = 4y + 3z from y(0) = 5.

The solution for y will be a combination of $e^{4t}$ and $e^{t}$. The λ’s are 4 and 1.

9 (a) If every column of A adds to zero, why is λ = 0 an eigenvalue?

(b) With negative diagonal and positive off-diagonal adding to zero, y’ = Ay
will be a “continuous” Markov equation. Find the eigenvalues and eigenvectors,
and the steady state as t → ∞:

$$\text{Solve} \quad \frac{dy}{dt} = \begin{bmatrix} -2 & 3 \\ 2 & -3 \end{bmatrix} y \quad \text{with} \quad y(0) = \begin{bmatrix} 4 \\ 1 \end{bmatrix}. \quad \text{What is } y(\infty)?$$

10 A door is opened between rooms that hold v(0) = 30 people and w(0) = 10 people.
The movement between rooms is proportional to the difference v - w:

$$\frac{dv}{dt} = w - v \quad \text{and} \quad \frac{dw}{dt} = v - w.$$

Show that the total v + w is constant (40 people). Find the matrix in dy/dt = Ay
and its eigenvalues and eigenvectors. What are v and w at t = 1 and t = ∞?

11 Reverse the diffusion of people in Problem 10 to dz/dt = -Az:

$$\frac{dv}{dt} = v - w \quad \text{and} \quad \frac{dw}{dt} = w - v.$$

The total v + w still remains constant. How are the λ’s changed now that A is changed
to -A? But show that v(t) grows to infinity from v(0) = 30.

12 A has real eigenvalues but B has complex eigenvalues:

$$A = \begin{bmatrix} a & 1 \\ 1 & a \end{bmatrix} \qquad B = \begin{bmatrix} b & -1 \\ 1 & b \end{bmatrix} \qquad (a \text{ and } b \text{ are real})$$

Find the stability conditions on a and b so that all solutions of dy/dt = Ay
and dz/dt = Bz approach zero as t → ∞.

13 Suppose P is the projection matrix onto the 45° line y = x in R². Its eigenvalues are
1 and 0 with eigenvectors (1, 1) and (1, -1). If dy/dt = -Py (notice minus sign)
can you find the limit of y(t) at t = ∞ starting from y(0) = (3, 1)?

14 The rabbit population shows fast growth (from 6r) but loss to wolves (from -2w).
The wolf population always grows in this model ($-w^2$ would control wolves):

$$\frac{dr}{dt} = 6r - 2w \quad \text{and} \quad \frac{dw}{dt} = 2r + w.$$

Find the eigenvalues and eigenvectors. If r(0) = w(0) = 30 what are the populations
at time t? After a long time, what is the ratio of rabbits to wolves?
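15 (a) Write (4, 0) as a combination $c_1x_1 + c_2x_2$ of these two eigenvectors of A:

$$\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ i \end{bmatrix} = i \begin{bmatrix} 1 \\ i \end{bmatrix} \qquad \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -i \end{bmatrix} = -i \begin{bmatrix} 1 \\ -i \end{bmatrix}.$$

(b) The solution to dy/dt = Ay starting from (4, 0) is $c_1e^{it}x_1 + c_2e^{-it}x_2$.
Substitute $e^{it} = \cos t + i \sin t$ and $e^{-it} = \cos t - i \sin t$ to find y(t).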


Questions 16-19 reduce second-order equations to first-order systems for (y, y’). 


16 


17 


18 


19 


20 


21 


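Find A to change the scalar equation y” = 5y’ + 4y into a vector equation for
y = (y, y’):

$$\frac{dy}{dt} = \begin{bmatrix} y' \\ y'' \end{bmatrix} = \begin{bmatrix} \ & \ \\ \ & \ \end{bmatrix} \begin{bmatrix} y \\ y' \end{bmatrix} = Ay.$$

What are the eigenvalues of A? Find them also by substituting $y = e^{\lambda t}$ into
y” = 5y’ + 4y.

Substitute $y = e^{\lambda t}$ into y” = 6y’ - 9y to show that λ = 3 is a repeated root.
This is trouble; we need a second solution after $e^{3t}$. The matrix equation is

$$\frac{d}{dt} \begin{bmatrix} y \\ y' \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -9 & 6 \end{bmatrix} \begin{bmatrix} y \\ y' \end{bmatrix}.$$

Show that this matrix has λ = 3, 3 and only one line of eigenvectors. Trouble here too.
Show that the second solution to y” = 6y’ - 9y is $y = te^{3t}$.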


(a) Write down two familiar functions that solve the equation $d^2y/dt^2 = -9y$.
Which one starts with y(0) = 3 and y’(0) = 0?

(b) This second-order equation y” = -9y produces a vector equation y’ = Ay:

$$y = \begin{bmatrix} y \\ y' \end{bmatrix} \qquad \frac{dy}{dt} = \begin{bmatrix} y' \\ y'' \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -9 & 0 \end{bmatrix} \begin{bmatrix} y \\ y' \end{bmatrix} = Ay.$$

Find y(t) by using the eigenvalues and eigenvectors of A: y(0) = (3, 0).

If c is not an eigenvalue of A, substitute $y = e^{ct}v$ and find a particular solution to
$dy/dt = Ay - e^{ct}b$. How does it break down when c is an eigenvalue of A?

A particular solution to dy/dt = Ay - b is $y_p = A^{-1}b$, if A is invertible. The
usual solutions to dy/dt = Ay give $y_n$. Find the complete solution $y = y_p + y_n$:


[3 


Find a matrix A to illustrate each of the unstable regions in the stability picture : 


dy _ dap =! ps 0 


(a) $\lambda_1 < 0$ and $\lambda_2 > 0$   (b) $\lambda_1 > 0$ and $\lambda_2 > 0$   (c) $\lambda = a \pm ib$ with a > 0.
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22 


23 


24 


25 


26 


27 


28 


Which of these matrices are stable ? Then Re \ < 0, trace < 0, and det > 0. 
—2 -3 —l1 -2 —1 2 
eee coe ee | 
For an n by n matrix with trace(A) = T and det(A) = D, find the trace and 


determinant of —A. Why is z’ = —Az unstable whenever y’ = Ay is stable ? 


(a) For a real 3 by 3 matrix with stable eigenvalues (Re λ < 0), show that trace < 0
and det < 0. Either three real negative λ or else $\lambda_2 = \bar{\lambda}_1$ and $\lambda_3$ is real.

(b) The trace and determinant of a 3 by 3 matrix do not determine all three
eigenvalues! Show that A is unstable even with trace < 0 and determinant < 0:

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 4 \\ 0 & 0 & -5 \end{bmatrix}$$


You might think that y’ = —A*y would always be stable because you are squaring 
the eigenvalues of A. But why is that equation unstable for A = . : ? 


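Find the three eigenvalues of A and the three roots of $s^3 - s^2 + s - 1 = 0$ (including
s = 1). The equation y''' - y” + y’ - y = 0 becomes

$$\begin{bmatrix} y \\ y' \\ y'' \end{bmatrix}' = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & -1 & 1 \end{bmatrix} \begin{bmatrix} y \\ y' \\ y'' \end{bmatrix} \quad \text{or} \quad z' = Az.$$

Each eigenvalue λ has an eigenvector $x = (1, \lambda, \lambda^2)$.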


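Find the two eigenvalues of A and the double root of $s^2 + 6s + 9 = 0$:

$$y'' + 6y' + 9y = 0 \quad \text{becomes} \quad \begin{bmatrix} y \\ y' \end{bmatrix}' = \begin{bmatrix} 0 & 1 \\ -9 & -6 \end{bmatrix} \begin{bmatrix} y \\ y' \end{bmatrix} \quad \text{or} \quad z' = Az.$$

The repeated eigenvalue gives only one solution $z = e^{\lambda t}x$. Find a second solution z
from the second solution $y = te^{-3t}$.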


Explain why a 3 by 3 companion matrix has eigenvectors $x = (1, \lambda, \lambda^2)$.

First Way: If the first component is $x_1 = 1$, the first row of Ax = λx gives the
second component $x_2$ = ____. Then the second row of Ax = λx gives the third
component $x_3 = \lambda^2$.

Second Way: y’ = Ay starts with $y_1' = y_2$ and $y_2' = y_3$. $y = e^{\lambda t}x$ solves
those equations. At t = 0 the equations become $\lambda x_1 = x_2$ and ____.
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29 


30 


31 


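Find A to change the scalar equation y” = 5y’ - 4y into a vector equation for
z = (y, y’):

$$\frac{dz}{dt} = \begin{bmatrix} y' \\ y'' \end{bmatrix} = \begin{bmatrix} \ & \ \\ \ & \ \end{bmatrix} \begin{bmatrix} y \\ y' \end{bmatrix} = Az.$$

What are the eigenvalues of the companion matrix A? Find them also by substituting
$y = e^{\lambda t}$ into y” = 5y’ - 4y.

(a) Write down two familiar functions that solve the equation $d^2y/dt^2 = -9y$.
Which one starts with y(0) = 3 and y’(0) = 0?

(b) This second-order equation y” = -9y produces a vector equation z’ = Az:

$$z = \begin{bmatrix} y \\ y' \end{bmatrix} \qquad \frac{dz}{dt} = \begin{bmatrix} y' \\ y'' \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -9 & 0 \end{bmatrix} \begin{bmatrix} y \\ y' \end{bmatrix} = Az.$$

Find z(t) by using the eigenvalues and eigenvectors of A: z(0) = (3, 0).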


(a) Change the third order equation y''' - 2y” - y’ + 2y = 0 to a first order system
z’ = Az for the unknown z = (y, y’, y”). The companion matrix A is 3 by 3.

(b) Substitute $y = e^{\lambda t}$ and also find det(A - λI). Those lead to the same λ’s.

(c) One root is λ = 1. Find the other roots and these complete solutions:

$$y = c_1e^{\lambda_1 t} + c_2e^{\lambda_2 t} + c_3e^{\lambda_3 t} \qquad z = C_1e^{\lambda_1 t}x_1 + C_2e^{\lambda_2 t}x_2 + C_3e^{\lambda_3 t}x_3.$$

These companion matrices have λ = 2, 1 and λ = 4, 1. Find their eigenvectors:

$$A = \begin{bmatrix} 0 & 1 \\ -2 & 3 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 0 & 1 \\ -4 & 5 \end{bmatrix} \qquad \text{Notice trace and determinant!}$$
6.4 The Exponential of a Matrix


This section expresses the solution to a system dy/dt = Ay in a different way. Instead
of combining eigenvector solutions $e^{\lambda t}x$, the new form uses the matrix exponential $e^{At}$.

$$\text{Solution to } y' = Ay \qquad y(t) = e^{At}y(0) \qquad (1)$$

This matrix $e^{At}$ matches $e^{at}$ when n = 1: the scalar case. For matrices, we can still
write the exponential as an infinite series. In one way this is better than depending on
eigenvectors, but maybe not in practice:

Advantage      We don’t need n independent eigenvectors for $e^{At}$.
Disadvantage   An infinite series is usually not so practical.

The new way produces one short symbol $e^{At}$ for the “solution matrix.” Still we
often compute in the old way with eigenvectors. This is like a linear system Av = b,
where $A^{-1}$ is the solution matrix but we compute v by elimination.

For large matrices, y’ = Ay uses completely different ways, often finite differences.


The Exponential Series

The most direct way to define the matrix $e^{At}$ is by an infinite series of powers of A:

$$\text{Matrix exponential} \qquad e^{At} = I + At + \tfrac{1}{2}(At)^2 + \cdots = \sum_{n=0}^{\infty} \frac{(At)^n}{n!} \qquad (2)$$

This series always converges, like the scalar case $e^{at}$ in Chapter 1. $e^{At}$ is the great
function of matrix calculus. The quickly growing factors n! still assure convergence.
The two key properties of $e^{at}$ continue to hold when a becomes a matrix A:

1. The derivative of $e^{At}$ is $Ae^{At}$        2. $(e^{At})(e^{AT}) = e^{A(t + T)}$


Property 1 says that $y(t) = e^{At}y(0)$ has derivative y’ = Ay. And y(t) starts correctly
from y(0) at t = 0, since $e^{A0} = I$ from equation (2). So $e^{At}y(0)$ solves y’ = Ay.

Suppose we set T = -t in Property 2. Then t + T = 0:

$$\text{The inverse of } e^{At} \text{ is } e^{-At} \qquad e^{At}e^{AT} = I \ \text{ when } T \text{ is } -t. \qquad (3)$$

$e^{At}$ has properties 1 and 2 even if A cannot be diagonalized. When A does have n
independent eigenvectors, the same eigenvector matrix V diagonalizes A and $e^{At}$. The
next step shows that $e^{At} = Ve^{\Lambda t}V^{-1}$: this is the good way to find $e^{At}$.

Assume A has n independent eigenvectors, so it is diagonalizable. Substitute $A = V\Lambda V^{-1}$
into the series for $e^{At}$. Whenever $V\Lambda V^{-1}V\Lambda V^{-1}$ appears, take out $V^{-1}V = I$.

$$\begin{aligned}
\text{Use the series} \qquad e^{At} &= I + V\Lambda V^{-1}t + \tfrac{1}{2}(V\Lambda V^{-1}t)(V\Lambda V^{-1}t) + \cdots \\
\text{Factor out } V \text{ and } V^{-1} \qquad &= V\left[I + \Lambda t + \tfrac{1}{2}(\Lambda t)^2 + \cdots\right]V^{-1} \qquad (4) \\
\text{Diagonalize } e^{At} \qquad e^{At} &= Ve^{\Lambda t}V^{-1}.
\end{aligned}$$

The numbers $e^{\lambda_i t}$ are on the diagonal of $e^{\Lambda t}$. Multiply $Ve^{\Lambda t}V^{-1}y(0)$ to see y(t).

Second Proof   $e^{At}$ has the same eigenvectors x as A. The eigenvalues of $e^{At}$ are $e^{\lambda t}$:

$$A^nx = \lambda^nx \quad \text{leads to} \quad e^{At}x = \left(1 + \lambda t + \tfrac{1}{2}(\lambda t)^2 + \cdots\right)x = e^{\lambda t}x. \qquad (5)$$

So the same eigenvector matrix V diagonalizes both A and $e^{At}$. The eigenvalue matrix for
$e^{At}$ is diag$(e^{\lambda_1 t}, \ldots, e^{\lambda_n t})$. This is exactly $e^{\Lambda t}$. Again $e^{At} = Ve^{\Lambda t}V^{-1}$.

The eigenvalues of the inverse matrix $e^{-At}$ are $e^{-\lambda t}$. This is $1/e^{\lambda t}$ as expected.
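The two routes to $e^{At}$, the infinite series and the eigenvector formula, can be compared numerically. This is a minimal sketch assuming NumPy; the helper `expm_series`, the matrix, and the time t = 0.5 are choices made here for illustration, and the series is simply truncated after a fixed number of terms.

```python
import numpy as np

def expm_series(A, t, terms=60):
    """Sum the first `terms` terms of I + At + (At)^2/2! + ..."""
    total = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ (A * t) / k      # builds (At)^k / k! from the previous term
        total = total + term
    return total

# A diagonalizable example with eigenvalues 4 and 1 (illustrative choice).
A = np.array([[4.0, 3.0],
              [0.0, 1.0]])
t = 0.5

lam, V = np.linalg.eig(A)
exp_diag = V @ np.diag(np.exp(lam * t)) @ np.linalg.inv(V)   # V e^{Lambda t} V^{-1}
exp_series = expm_series(A, t)

print(np.max(np.abs(exp_diag - exp_series)))
```

For this triangular A the diagonal of $e^{At}$ holds $e^{4t}$ and $e^{t}$, and the off-diagonal entry works out to $e^{4t} - e^{t}$.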
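Example 1   The rotation matrix $A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ has eigenvalues $\lambda_1 = i$ and $\lambda_2 = -i$:

$$e^{At} = Ve^{\Lambda t}V^{-1} = \begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix} \begin{bmatrix} e^{it} & 0 \\ 0 & e^{-it} \end{bmatrix} \frac{1}{2} \begin{bmatrix} 1 & -i \\ 1 & i \end{bmatrix} = \begin{bmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{bmatrix}. \qquad (6)$$

This produces $e^{At}$ without adding up an infinite series. We could also begin the series:

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} t + \frac{1}{2} \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} t^2 + \frac{1}{6} \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} t^3 = \begin{bmatrix} 1 - \frac{1}{2}t^2 & t - \frac{1}{6}t^3 \\ -t + \frac{1}{6}t^3 & 1 - \frac{1}{2}t^2 \end{bmatrix}.$$

The cosine series starts with $1 - \frac{1}{2}t^2$. The sine series starts with $t - \frac{1}{6}t^3$. The full series for
$e^{At}$ gives the full series for cos t and sin t: very exceptional.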


Example 1 continued   What is the solution to dy/dt = Ay with y(0) = (1, 0)?

Answer   We know that $y(t) = (y_1, y_2)$ is $e^{At}y(0)$, and equation (6) gives $e^{At}$:

$$\begin{aligned} y_1' &= y_2 \\ y_2' &= -y_1 \end{aligned} \qquad \begin{bmatrix} y_1(t) \\ y_2(t) \end{bmatrix} = \begin{bmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} \cos t \\ -\sin t \end{bmatrix}. \qquad (7)$$

Right! The derivative of $y_1 = \cos t$ is $-\sin t = y_2$. The derivative of $y_2 = -\sin t$ is $-\cos t = -y_1$.
The equations y’ = Ay are satisfied. When t = 0, we start correctly at y(0) = (1, 0).

This solution is important in physics and engineering. The point y(t) is on the unit circle
$y_1^2 + y_2^2 = \cos^2 t + \sin^2 t = 1$. It goes around the circle with constant speed.
The velocity is y’ = (-sin t, -cos t). The second derivative (acceleration) is
y” = (-cos t, sin t) = -y because $A^2 = -I$. This vector y” points in to the center (0, 0).
We have a planet going in a circle around the sun.
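As a numerical illustration of Example 1, the truncated series for $e^{At}$ can be compared with the rotation matrix in equation (6). The sketch assumes NumPy; the helper name `expm_series` and the sample time t = 1.2 are choices made for this example only.

```python
import numpy as np

def expm_series(A, t, terms=40):
    """Truncated series I + At + (At)^2/2! + ... (a simple sketch, not a robust solver)."""
    total = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ (A * t) / k
        total = total + term
    return total

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
t = 1.2

E = expm_series(A, t)
R = np.array([[np.cos(t), np.sin(t)],
              [-np.sin(t), np.cos(t)]])

y = E @ np.array([1.0, 0.0])        # solution starting from y(0) = (1, 0)
radius = np.hypot(y[0], y[1])       # distance from the center (0, 0)

print(np.max(np.abs(E - R)), radius)
```

The radius stays at 1 for every t, which is the circular orbit described above.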




Example 2   Suppose A is triangular but we can’t diagonalize it (only one eigenvector):

$$y' = Ay \qquad \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}' = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \qquad \begin{aligned} y_1' &= y_1 + y_2 \\ y_2' &= y_2 \end{aligned} \qquad (8)$$

A has no invertible eigenvector matrix V. How to find y(t) without two eigenvectors?

Solution   Since A is triangular, back substitution will solve y’ = Ay. Begin by solving
the last equation $y_2' = y_2$. Then solve for $y_1$:

$$y_2(t) = e^ty_2(0) \qquad \text{Then} \quad y_1' = y_1 + y_2 = y_1 + e^ty_2(0)$$

That equation for $y_1$ has a source term $q(t) = e^ty_2(0)$. Chapter 1 found the solution $y_1(t)$:

$$e^ty_1(0) + \int_0^t e^{t-s}q(s)\,ds = e^ty_1(0) + e^ty_2(0)\int_0^t ds = e^ty_1(0) + te^ty_2(0). \qquad (9)$$

At last we have a reason for the extra factor t. The natural growth rate of $y_1$ is also
the growth rate of $y_2$. This leads to “resonance” in $y_1' = y_1 + y_2$, and the growth of $te^t$
is extra fast. We saw resonance with t times an exponential in Chapter 2. Now we are seeing the t in $e^{At}$:

$$\begin{aligned} y_1(t) &= e^ty_1(0) + te^ty_2(0) \\ y_2(t) &= e^ty_2(0) \end{aligned} \quad \text{means that} \quad e^{At} = \begin{bmatrix} e^t & te^t \\ 0 & e^t \end{bmatrix}. \qquad (10)$$


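Example 2 (using $e^{At}$)   For this triangular matrix A, we can also add the series for $e^{At}$:

$$\begin{aligned}
e^{At} &= I + At + \tfrac{1}{2}(At)^2 + \tfrac{1}{6}(At)^3 + \cdots \\
&= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} t + \frac{1}{2} \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} t^2 + \frac{1}{6} \begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix} t^3 + \cdots \qquad (11) \\
&= \begin{bmatrix} e^t & te^t \\ 0 & e^t \end{bmatrix} \quad \text{because} \quad t + t^2 + \tfrac{1}{2}t^3 + \cdots = te^t.
\end{aligned}$$

All the powers of a triangular matrix are triangular. So the diagonal entries of A give the
diagonal entries of $e^{At}$. Those are the eigenvalues of $e^{At}$ and here they are both $e^t$.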
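The closed form in Example 2 can be sanity-checked without eigenvectors. The sketch below assumes NumPy and tests three things: the pattern in the powers of A, the starting value $e^{A0} = I$, and the derivative property through a central finite difference (the step h and the time t are arbitrary illustrative choices).

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

def E(t):
    """Closed form e^{At} = e^t [[1, t], [0, 1]] from Example 2."""
    return np.exp(t) * np.array([[1.0, t],
                                 [0.0, 1.0]])

power_5 = np.linalg.matrix_power(A, 5)          # powers follow [[1, n], [0, 1]]

t, h = 0.7, 1e-5
dE = (E(t + h) - E(t - h)) / (2 * h)            # finite-difference derivative of e^{At}
derivative_error = np.max(np.abs(dE - A @ E(t)))
start_error = np.max(np.abs(E(0.0) - np.eye(2)))

# A - I has rank 1, so there is only one line of eigenvectors.
rank_A_minus_I = np.linalg.matrix_rank(A - np.eye(2))

print(power_5, derivative_error, start_error, rank_A_minus_I)
```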


Source Term in y’ = Ay+q 


We can solve y’ = ay + q for a single equation (1 by 1). Now allow a matrix A:

$$\text{Old} \quad \frac{dy}{dt} = ay + q \qquad \text{New} \quad \frac{dy}{dt} = Ay + q \qquad (12)$$

Change a to A! For constant q, that is the only change in the formula for y:

$$\text{Old} \quad y(t) = e^{at}y(0) + \frac{e^{at} - 1}{a}\,q \qquad \text{New} \quad y(t) = e^{At}y(0) + (e^{At} - I)A^{-1}q \qquad (13)$$

The derivative of y produces Ay, except for the constant $A^{-1}q$ with derivative = zero.
But this term $A^{-1}q$ disappears safely in Ay + q, because $-AA^{-1}q + q = 0$.
Chapter 1 was built on the growth factor $e^{a(t-s)}$ in the integral for $y_p$. Now it is $e^{A(t-s)}$.

Principle   Each input q(s) has growth factor $e^{A(t-s)}$ from time s to time t.
For constant A, the growth (or decay) over time t - s is just multiplication by $e^{A(t-s)}$.

$$y' = Ay + q(t) \quad \text{is solved by} \quad y(t) = e^{At}y(0) + \int_0^t e^{A(t-s)}q(s)\,ds. \qquad (14)$$
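For a constant source q, the formula $y(t) = e^{At}y(0) + (e^{At} - I)A^{-1}q$ can be tested by differentiating it numerically. This sketch assumes NumPy; the symmetric matrix A, the source q, and the starting vector are illustrative values chosen here, and $e^{At}$ is built from eigenvectors since this A is diagonalizable.

```python
import numpy as np

# Illustrative stable symmetric matrix (eigenvalues -1 and -3), constant source q.
A = np.array([[-2.0, 1.0],
              [1.0, -2.0]])
q = np.array([1.0, 0.0])
y0 = np.array([3.0, -1.0])

lam, V = np.linalg.eigh(A)                 # orthonormal eigenvectors since A is symmetric

def expAt(t):
    return V @ np.diag(np.exp(lam * t)) @ V.T

def y(t):
    E = expAt(t)
    return E @ y0 + (E - np.eye(2)) @ np.linalg.solve(A, q)

t, h = 0.4, 1e-5
residual = (y(t + h) - y(t - h)) / (2 * h) - (A @ y(t) + q)   # should be near zero

steady_state = -np.linalg.solve(A, q)      # where Ay + q = 0

print(np.max(np.abs(residual)), y(50.0), steady_state)
```

Because every eigenvalue here is negative, y(t) approaches the steady state $-A^{-1}q$ for large t.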


Similar Matrices A and B 
To end this section, I will solve y’ = Ay in one more way. Same result, new approach. 
Change of variables. Write y = Vz to change from y(t) to the new variable z(t).

$$\frac{dy}{dt} = Ay \quad \text{becomes} \quad V\frac{dz}{dt} = AVz \quad \text{which is} \quad \frac{dz}{dt} = V^{-1}AVz. \qquad (15)$$

The matrix A has changed to $B = V^{-1}AV$. Then the solution for z involves $e^{Bt}$.

$$B = V^{-1}AV \qquad z' = Bz \ \text{ produces } \ z(t) = e^{Bt}z(0) \qquad (16)$$

Changing back to y = Vz, that solution becomes $y(t) = Ve^{Bt}z(0) = Ve^{Bt}V^{-1}y(0)$.

$$\text{The exponential of } A = VBV^{-1} \text{ is } \quad e^{At} = Ve^{Bt}V^{-1}. \qquad (17)$$

Special case: When V is the eigenvector matrix, B is the eigenvalue matrix Λ.

Here is my point. Equation (17) is true for any invertible matrix V. Choosing the
eigenvector matrix of A makes B diagonal; in fact $B = V^{-1}AV = \Lambda$. This is the
outstanding choice for V, to produce B = Λ when A has n independent eigenvectors.
But any invertible V is now allowed, and we have a name for B: similar matrix.

I can quickly prove that eigenvalues stay unchanged. Eigenvectors change to $u = V^{-1}x$:

$$\text{If } Ax = \lambda x \text{ then } V^{-1}Ax = \lambda V^{-1}x \text{ which is } V^{-1}AVu = Bu = \lambda u. \qquad (18)$$

By allowing all invertible V, we have a whole family of matrices $B = V^{-1}AV$. All are
similar to A, all have the same eigenvalues as A, only the eigenvectors change with V.

In case A cannot be diagonalized, a good choice of V makes B upper triangular.
V is not easy to compute, but it greatly simplifies the problem. Example 2 showed how
z(t) comes from back substitution in z’ = Bz. Then y(t) = Vz(t) solves y’ = Ay
without n independent eigenvectors of A.
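Equation (18) is easy to try on a small example. The sketch below assumes NumPy; the matrices A and V are arbitrary illustrative choices (V only needs to be invertible), not taken from the text.

```python
import numpy as np

A = np.array([[4.0, 3.0],
              [0.0, 1.0]])            # eigenvalues 4 and 1
V = np.array([[2.0, 1.0],
              [1.0, 1.0]])            # any invertible matrix (det = 1 here)

B = np.linalg.inv(V) @ A @ V          # similar matrix B = V^{-1} A V

eig_A = np.sort(np.linalg.eigvals(A).real)
eig_B = np.sort(np.linalg.eigvals(B).real)

x = np.array([1.0, 0.0])              # eigenvector of A for lambda = 4
u = np.linalg.solve(V, x)             # u = V^{-1} x should be an eigenvector of B

print(eig_A, eig_B, B @ u - 4.0 * u)
```

The eigenvalues agree, while the eigenvector x of A becomes $u = V^{-1}x$ for B.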
Fundamental Matrices (Optional Topic)

A linear system dy/dt = A(t)y is completely solved when you have n independent
solutions $y_1(t)$ to $y_n(t)$. Put those solutions into the columns of an n by n matrix M(t):

$$\text{Fundamental matrix} \quad M(t) = \begin{bmatrix} y_1(t) & \cdots & y_n(t) \end{bmatrix} \quad \text{has} \quad \frac{dM}{dt} = A(t)M(t). \qquad (19)$$

Every column of dM/dt has dy/dt = Ay. All columns together give dM/dt = AM.
“Linear independence” means that M is invertible. The determinant of M is not zero.
This determinant W(t) is called the “Wronskian” of the n solutions in the columns of M:

$$\text{Wronskian} \qquad W(t) = \det M(t) \qquad (20)$$

The beautiful fact is this: If the Wronskian starts from W ≠ 0 at time t = 0, then
W(t) ≠ 0 for all t. Independence at the start means independence forever. A combination
$y(t) = c_1y_1(t) + \cdots + c_ny_n(t)$ can only be zero at time t if it started from y(0) = 0.
Solutions to y’ = Ay don’t hit zero! So W(t) = 0 requires W(0) = 0, as in this neat
formula discussed in the Chapter 6 Notes (exponentials are never zero).

$$\frac{dW}{dt} = (\text{trace } A(t))\,W \quad \text{and then} \quad W(t) = e^{\int \text{trace } A(t)\,dt}\,W(0). \qquad (21)$$

What are M(t) and W(t) for a second order equation y” + B(t)y’ + C(t)y = 0? We
know how to convert this to a first order system y’ = A(t)y. The vector unknown is
y = (y, y’) and A(t) is a companion matrix containing -B(t) and -C(t). The two
independent solutions in the columns of M(t) are $(y_1, y_1')$ and $(y_2, y_2')$:

$$\text{Matrix} \quad M(t) = \begin{bmatrix} y_1 & y_2 \\ y_1' & y_2' \end{bmatrix} \qquad \text{Wronskian} \quad W(t) = \det M = y_1y_2' - y_2y_1'. \qquad (22)$$

Again W(t) ≠ 0 is the test for $y_1$ and $y_2$ to be independent. The test is passed for all t
if W(0) ≠ 0. In the mysterious formula (21), the trace of A(t) is -B(t).
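For a constant companion matrix, formula (21) says that the Wronskian decays like $e^{-Bt}$. The sketch below checks this numerically; it assumes NumPy, and the coefficients 3 and 2 (named `B_coef` and `C_coef` here) are illustrative values giving eigenvalues -1 and -2.

```python
import numpy as np

# Companion matrix for y'' + 3y' + 2y = 0 (illustrative coefficients).
B_coef, C_coef = 3.0, 2.0
A = np.array([[0.0, 1.0],
              [-C_coef, -B_coef]])

lam, V = np.linalg.eig(A)

def M(t):
    """Fundamental matrix e^{At} = V e^{Lambda t} V^{-1}, which starts from M(0) = I."""
    return (V @ np.diag(np.exp(lam * t)) @ np.linalg.inv(V)).real

t = 1.5
W = np.linalg.det(M(t))                                   # Wronskian at time t
W_formula = np.exp(np.trace(A) * t) * np.linalg.det(M(0.0))

print(W, W_formula)
```

Since the trace of A is -B = -3, both numbers should equal $e^{-4.5}$.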


You will naturally ask: What is this fundamental matrix M(t)? Why are we only seeing
it now? One answer is that you already know the growth factor G from Chapter 1:
$M = G(0, t) = \exp(\int a(t)\,dt)$. For systems, you also know $M = e^{At}$. That is the perfect
answer when A is constant. $e^{At}$ is the best possible M(t) because it starts from M(0) = I.

It is often hard to find M(t) when the matrix A depends on t (then nothing is easy).
We know that y’ = A(t)y has n independent solutions y(t). But in most cases we don’t
know what those solutions are. The point of fundamental matrices is that the solution y(t)
comes directly from M(t), when and if we know M:

$$y(t) = M(t)M(0)^{-1}y(0) \quad \text{for any } M(t) \qquad (23)$$


Let me say a little more about constant A and varying A(t), and then stop. 
Constant A with n independent eigenvectors in V   We know n solutions $y = e^{\lambda t}x$:

$$\text{Put those } y\text{’s into} \quad M(t) = \begin{bmatrix} e^{\lambda_1 t}x_1 & e^{\lambda_2 t}x_2 & \cdots & e^{\lambda_n t}x_n \end{bmatrix} = Ve^{\Lambda t}.$$

How does this differ from $e^{At}$? You can see everything at t = 0, when this M(t) is V.
If you want the fundamental matrix that equals I at t = 0, just multiply by $M(0)^{-1} = V^{-1}$:
When $A = V\Lambda V^{-1}$, the best fundamental matrix is $M = Ve^{\Lambda t}V^{-1}$ which is $e^{At}$.

Time-varying A(t) with time-varying eigenvectors   The equation y’ = A(t)y is more
difficult. The discussion below shows how the expected solution formula fails. The chain rule
goes wrong. Finding even one solution $y_1(t)$ is a big challenge. The optimistic point
is that if we can find $y_1(t)$, then “variation of parameters” will lead us to $y_2 = C(t)y_1$.

Let me focus on a famous equation that has been studied by great mathematicians:

$$\text{Bessel’s equation} \qquad x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + (x^2 - p^2)\,y = 0 \qquad (24)$$

The solutions are Bessel functions of order p. When the order is p = 1/2 these solutions
$y_1$ and $y_2$ are quite special (the variable t is usually changed to x).

$$y_1(x) = \sqrt{\frac{2}{\pi x}}\,\sin x \quad \text{and} \quad y_2(x) = \sqrt{\frac{2}{\pi x}}\,\cos x \quad \text{go into} \quad M = \begin{bmatrix} y_1 & y_2 \\ y_1' & y_2' \end{bmatrix}$$

Those are independent solutions and the Wronskian $W = y_1y_2' - y_2y_1'$ is never zero.

The most important Bessel functions have p = 0, 1, 2, ... and whole books are written
about these functions. They are not simple! The first and most famous Bessel function is
$y = J_0(x)$, with order p = 0:

$$J_0(x) = 1 - \frac{x^2}{2^2} + \frac{x^4}{2^2 4^2} - \frac{x^6}{2^2 4^2 6^2} + \cdots \quad \text{resembles a damped cosine.}$$

The second solution $Y_0$, independent of $J_0$, blows up at x = 0. When you divide Bessel’s
equation (24) by $x^2$, so as to start the equation with y”, you see that its coefficients are
singular: 1/x and $1 - p^2/x^2$ also blow up at x = 0: A singular point.


Failure of a Formula 


A single equation dy/dt = a(t)y has a neat solution $y = e^{P(t)}y(0)$. We choose P(t) as the
integral of a(t). By the chain rule, dy/dt has the desired factor a(t) = dP/dt. I am very
sorry to say that $y = e^{P(t)}y(0)$ fails for matrices A(t) and systems y’ = A(t)y.

There is no doubt that the derivative of the integral of time-varying A(t) is A(t).
Even for matrices, this part is true:

$$\text{Fundamental Theorem of Calculus} \qquad \frac{d}{dt}\int_0^t A(s)\,ds = \frac{dP}{dt} = A(t). \qquad (25)$$


When A is a constant matrix, that integral is P = At and its derivative is A. Then
the derivative of $e^{At}$ is $Ae^{At}$. This whole section is built on that true statement. We hope
that the same chain rule will give the answer when A(t) is varying and not constant:

$$\text{The derivative of } G = \exp\left[\int_0^t A(s)\,ds\right] \text{ “should be” } A(t)G. \quad \text{Not always!} \qquad (26)$$

When the matrix A(t) is changing with time, the chain rule in (26) can let us down.
This leaves no simple formula for y(t). How can things go wrong?

The difficulty is that $e^A$ times $e^B$ may not be the same as $e^{A+B}$. Problem 7 gives an
example of A and B. Those matrices do not satisfy AB = BA and this destroys the rule for
exponents. It is true that $e^Ae^B = e^{A+B}$ when AB = BA, but not here.

Let me use those matrices in Problem 7 to construct a two-part example:

$$y' = By \ \text{ for } t \le 1 \quad \text{and then} \quad y' = Ay \ \text{ for } t > 1 \qquad (27)$$

Our time-varying matrix A(t) jumps from B to A at t = 1. The integral of A(t) is P(t):

$$P(t) = \int_0^t A(s)\,ds = Bt \ (\text{for } t \le 1) \quad \text{and} \quad A(t - 1) + B \ (\text{for } t > 1). \qquad (28)$$

But the exponential of P(t) does not solve our differential equation (27) at t = 2:

$$P(2) = \int_0^2 A(s)\,ds = A + B \ \text{ is correct but } \ y(2) = e^{A+B}y(0) \ \text{ is wrong.}$$

The correct answer is $y(2) = e^Ae^By(0)$. First B then A. The solution is $e^{Bt}y(0)$
up to time t = 1, when B changes to A. After t = 1 the solution is $e^{A(t-1)}e^By(0)$.

The chain rule in (26) is wrong, because $e^Ae^B$ is different from $e^{A+B}$.
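The failure can be seen with numbers. The sketch below assumes NumPy and uses two noncommuting matrices of the form [1 c; 0 0] with c = 4 and c = -4; these are illustrative choices made here and are an assumption, not necessarily the exact matrices printed in Problem 7. The helper `expm_series` is a simple truncated series, not a production matrix exponential.

```python
import numpy as np

def expm_series(A, terms=60):
    """Truncated series I + A + A^2/2! + ... for the matrix exponential e^A."""
    total = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        total = total + term
    return total

# Illustrative noncommuting pair (assumed for this sketch): AB != BA.
A = np.array([[1.0, 4.0], [0.0, 0.0]])
B = np.array([[1.0, -4.0], [0.0, 0.0]])

eA, eB, eAB = expm_series(A), expm_series(B), expm_series(A + B)

y0 = np.array([0.0, 1.0])
y2_correct = eA @ eB @ y0      # first B for 0 <= t <= 1, then A for 1 < t <= 2
y2_wrong = eAB @ y0            # what the "chain rule" formula would predict

print(y2_correct, y2_wrong)
```

The two answers differ in the first component, and reversing the order to $e^Be^A$ gives a third result.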


REVIEW OF THE KEY IDEAS

1. The exponential of At is $e^{At} = I + At + \frac{1}{2}(At)^2 + \frac{1}{6}(At)^3 + \cdots$

2. The solution to y’ = Ay is $y(t) = e^{At}y(0)$. This is $Ve^{\Lambda t}V^{-1}y(0)$ if $V^{-1}$ exists.

3. That solution is the same as $c_1e^{\lambda_1 t}x_1 + \cdots + c_ne^{\lambda_n t}x_n$, with $c = V^{-1}y(0)$.

4. The solution to y’ = Ay + q (constant source) is $y(t) = e^{At}y(0) + (e^{At} - I)A^{-1}q$.

5. All similar matrices $B = VAV^{-1}$ (with any V) have the same eigenvalues as A.

6. If A(t) is time-varying, easy formulas for the fundamental matrix M(t) will fail.
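WORKED EXAMPLE

Show that $y(t) = e^{At}y(0)$ is exactly $c_1e^{\lambda_1 t}x_1 + \cdots + c_ne^{\lambda_n t}x_n$ if y(0) = Vc.

Step 1   Write $y(0) = c_1x_1 + \cdots + c_nx_n$. This is $\begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix} \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} = Vc$.

Step 2   Starting from an eigenvector x, the solution is $y = ce^{\lambda t}x$.

Step 3   Add those n solutions to get $Ve^{\Lambda t}c = Ve^{\Lambda t}V^{-1}y(0) = e^{At}y(0)$.

Here are those steps for a triangular matrix A. Suppose y(0) = (5, 3). First A and V:

$$A = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix} \text{ has } \lambda_1 = 1 \text{ and } x_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \lambda_2 = 2 \text{ and } x_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

$$\text{Step 1} \qquad y(0) = \begin{bmatrix} 5 \\ 3 \end{bmatrix} = 2\begin{bmatrix} 1 \\ 0 \end{bmatrix} + 3\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \end{bmatrix} = Vc.$$

Step 2   The separate solutions $ce^{\lambda t}x$ from eigenvectors are $2e^tx_1$ and $3e^{2t}x_2$.

Step 3   The final $y(t) = e^{At}y(0) = Ve^{\Lambda t}V^{-1}y(0)$ is the sum $2e^tx_1 + 3e^{2t}x_2$.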


0 1 0 1 
° At ° ° 
Challenge Find e*” for the companion matrices CoO and Pye | 


Their eigenvectors in $Ve^{\Lambda t}V^{-1}$ are always (1, λ).


Problem Set 6.4 


1 If Ax = λx, find an eigenvalue and an eigenvector of $e^{At}$ and also of $-e^{-At}$.

2 (a) From the infinite series $e^{At} = I + At + \cdots$ show that its derivative is $Ae^{At}$.
(b) The series for et ends quickly if A = | ; ; because A? = ; ; ; 
Find e“* and take its derivative (which should agree with Ae“), 


3 For A = | ; ; with eigenvectors in V = ; , | compute eAt — peAty-1, 


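4 Why is $e^{(A + 3I)t}$ equal to $e^{At}$ multiplied by $e^{3t}$?

5 Why is $e^{A^{-1}}$ not the inverse of $e^A$? What is the correct inverse of $e^A$?

6 Compute $A^n = \begin{bmatrix} 1 & c \\ 0 & 0 \end{bmatrix}^n$. Add the series to find $e^{At} = \begin{bmatrix} e^t & c(e^t - 1) \\ 0 & 1 \end{bmatrix}$.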
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10 


11 


12 


13 


14 


15 


16 


17 
Find $e^A$ and $e^B$ by using Problem 6 for c = 4 and c = -4. Multiply to show that
the matrices $e^Ae^B$ and $e^Be^A$ and $e^{A+B}$ are all different.

$$A = \begin{bmatrix} 1 & 4 \\ 0 & 0 \end{bmatrix} \qquad B = \begin{bmatrix} 1 & -4 \\ 0 & 0 \end{bmatrix} \qquad A + B = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}$$

Multiply the first terms $I + A + \frac{1}{2}A^2$ of $e^A$ by the first terms $I + B + \frac{1}{2}B^2$ of $e^B$.
Do you get the correct first three terms of $e^{A+B}$? Conclusion: $e^{A+B}$ is not always
equal to $(e^A)(e^B)$. The exponent rule only applies when AB = BA.


Write A = [§§] in the form VAV ~!. Find eAt from VeAtV-2, 


Starting from y(0) the solution at time t is $e^{At}y(0)$. Go an additional time t
to reach $e^{At}e^{At}y(0)$. Conclusion: $e^{At}$ times $e^{At}$ equals ____.


Diagonalize A by V and confirm this formula for eAt by using Vedty—1; 


2t Seer, 
A=(5 a Ate § ee ¢ I At t=O this matrixis 


(a) Find A? and A? and A” for A = | ; : with repeated eigenvalues \ = 1, 1. 
(b) Add the infinite series to find eAt. (The VeAty-1 method won’t work.) 


(a) Solve y’ = Ay as a combination of eigenvectors of this matrix A: 
v=|; | with uo) =| 3 | 
(b) Write the equations as y; = yo and ys = y,. Find an equation for y// with y2 
eliminated. Solve for y;(t) and compare with part (a). 
Similar matrices A and $B = V^{-1}AV$ have the same eigenvalues if V is invertible.
Second proof   $\det(V^{-1}AV - \lambda I) = (\det V^{-1})(\det(A - \lambda I))(\det V)$.
Why is this equation true? Then both sides are zero when det(A - λI) = 0.

If B is similar to A, the growth rates for z’ = Bz are the same as for y’ = Ay.
That equation converts to the equation for z when $B = V^{-1}AV$ and z = ____.

If Ax = λx ≠ 0, what is an eigenvalue and eigenvector of $(e^{At} - I)A^{-1}$?


The matrix B = [0 —é] has B? = 0. Find et from a (short) infinite series. 
Check that the derivative of e?? is BeP*. 
18 Starting from y(0) = 0, solve y’ = Ay + q as a combination of the eigenvectors.
Suppose the source is $q = q_1x_1 + \cdots + q_nx_n$. Solve for one eigenvector at a time,
using the solution $y(t) = (e^{at} - 1)q/a$ to the scalar equation y’ = ay + q.

Then $y(t) = (e^{At} - I)A^{-1}q$ is a combination of eigenvectors when all $\lambda_i \ne 0$.


19 Solve for y(t) as a combination of the eigenvectors x; = (1,0) and w2 = (1,1): 
fees Yj 1 1 Yi d 4 yi(0) =0 
vmauea |B] -[oa][e els] Romo 
20 Solve y’ = Ay = ; 5 | y in three steps. First find the \’s and x’s. 


(1) Write y(0) = (3, 1) as a combination $c_1x_1 + c_2x_2$
(2) Multiply $c_1$ and $c_2$ by $e^{\lambda_1 t}$ and $e^{\lambda_2 t}$.
(3) Add the solutions $c_1e^{\lambda_1 t}x_1 + c_2e^{\lambda_2 t}x_2$.

21 Write five terms of the infinite series for $e^{At}$. Take the t derivative of each term. Show
that you have four terms of $Ae^{At}$. Conclusion: $e^{At}y(0)$ solves dy/dt = Ay.


Problems 22-25 are about time-varying systems y’ = A(t)y. Success then failure. 


22 Suppose the constant matrix C has Cx = λx, and p(t) is the integral of a(t).
Substitute $y = e^{\lambda p(t)}x$ to show that dy/dt = a(t)Cy. Eigenvectors still solve
this special time-varying system: constant matrix C multiplied by the scalar a(t).

23 Continuing Problem 22, show from the series for $M(t) = e^{p(t)C}$ that dM/dt =
a(t)CM. Then M is the fundamental matrix for the special system y’ = a(t)Cy.

If a(t) = 1 then its integral is p(t) = t and we recover $M = e^{Ct}$.


24 The integral of $A = \begin{bmatrix} 1 & 2t \\ 0 & 0 \end{bmatrix}$ is $P = \begin{bmatrix} t & t^2 \\ 0 & 0 \end{bmatrix}$. The exponential of P is
$e^P = \begin{bmatrix} e^t & t(e^t - 1) \\ 0 & 1 \end{bmatrix}$. From the chain rule we might hope that the derivative
of $e^{P(t)}$ is $P'e^{P(t)} = Ae^{P(t)}$. Compute the derivative of $e^{P(t)}$ and compare with
the wrong answer $Ae^{P(t)}$. (One reason this feels wrong: Writing the chain rule as
$(d/dt)e^P = e^P\,dP/dt$ would give $e^PA$ instead of $Ae^P$. That is wrong too.)

25 Find the solution to y’ = A(t)y in Problem 24 by solving for $y_2$ and then $y_1$:

$$\text{Solve} \quad \begin{bmatrix} dy_1/dt \\ dy_2/dt \end{bmatrix} = \begin{bmatrix} 1 & 2t \\ 0 & 0 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \quad \text{starting from} \quad \begin{bmatrix} y_1(0) \\ y_2(0) \end{bmatrix}.$$

Certainly $y_2(t)$ stays at $y_2(0)$. Find $y_1(t)$ by “undetermined coefficients” A, B, C:
$y_1' = y_1 + 2ty_2(0)$ is solved by $y_1 = y_p + y_n = At + B + Ce^t$.

Choose A, B, C to satisfy the equation and match the initial condition $y_1(0)$.

The wrong answer in Problem 24 included the incorrect factor $te^t$ in $e^P$.
6.5 Second Order Systems and Symmetric Matrices

This section solves a differential equation that is crucial in engineering and physics:

$$\text{Oscillation equation} \qquad \frac{d^2y}{dt^2} + Sy = 0. \qquad (1)$$


Since this is second order in time, we need two vectors as initial conditions at t = 0:

$$\text{Starting position and starting velocity} \qquad y(0) \ \text{ and } \ v(0) = \frac{dy}{dt}(0) \ \text{ are given.}$$

If y has n components, we have n second order equations and 2n initial conditions.
This is the right number to find y(t). Allow me to say this early: The oscillation
equation (1) is the most basic form of the Fundamental Equation of Engineering.


The more general equation includes a damping term B dy/dt and a forcing term
F cos Ωt. Those give damped forced oscillations, where equation (1) is about “free”
oscillations. For one mass and one equation, Chapter 2 took that step to damping and forcing.
Now we have n masses and n equations and three n by n matrices M, B, K.

$$\text{Fundamental Equation} \qquad M\frac{d^2y}{dt^2} + B\frac{dy}{dt} + Ky = F\cos\Omega t. \qquad (2)$$

The mass matrix is M, the stiffness matrix is K. Those are the pieces we always see and
always need. When the damping matrix B and the forcing vector F are removed, that takes
us to the heart of the fundamental equation: free oscillations.


Mass and stiffness matrices My"”+Ky=0. (3) 


The matrix S in equation (1) is M~!K. Its symmetric form is M~!/2kK M~1/2, In many 
applications the mass matrix M is diagonal. 


If we look for eigenvector solutions $y = e^{i\omega t}x$, the differential equation produces
$Kx = \omega^2Mx$. This “generalized” eigenvalue problem has an extra matrix $M$,
but it is not more difficult than $Sx = \lambda x$. The MATLAB command is eig(K,M). An
essential point is that the eigenvalues are still real and positive, when both $M$ and $K$ are
positive definite. Positive eigenvalues and positive energy are the key to Chapter 7.
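
A small Python sketch can illustrate the generalized problem $Kx = \omega^2Mx$. It borrows the matrices $M = \mathrm{diag}(1, 2)$ and $K$ from Problem 21 of this section purely as sample data; the helper names `eig_sym_2x2` and `det_K_minus_lam_M` are illustrative choices, not MATLAB or library functions. The sketch forms the symmetric matrix $M^{-1/2}KM^{-1/2}$ and checks that its eigenvalues are positive and satisfy $\det(K - \lambda M) = 0$.

```python
import math

# Sample data (assumed for illustration): diagonal mass matrix and
# symmetric positive definite stiffness matrix.
m = [1.0, 2.0]                      # M = diag(1, 2)
K = [[2.0, -2.0],
     [-2.0, 4.0]]

# Symmetric form M^(-1/2) K M^(-1/2). For diagonal M, entry (i, j)
# of K is divided by sqrt(m_i * m_j).
S = [[K[i][j] / math.sqrt(m[i] * m[j]) for j in range(2)] for i in range(2)]


def eig_sym_2x2(S):
    """Eigenvalues of a symmetric 2 by 2 matrix from trace and determinant."""
    trace = S[0][0] + S[1][1]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    root = math.sqrt(trace * trace / 4.0 - det)
    return trace / 2.0 - root, trace / 2.0 + root


def det_K_minus_lam_M(lam):
    """det(K - lam * M), which vanishes at a generalized eigenvalue."""
    return (K[0][0] - lam * m[0]) * (K[1][1] - lam * m[1]) - K[0][1] * K[1][0]


lam_small, lam_large = eig_sym_2x2(S)
print(lam_small, lam_large)             # both positive: these are omega^2
print(det_K_minus_lam_M(lam_small))     # close to zero
print(det_K_minus_lam_M(lam_large))     # close to zero
```

The frequencies are then $\omega = \sqrt{\lambda}$, which is why positive eigenvalues matter.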


When the forcing term is a constant $F$, the damping brings us to a steady state $y_\infty$.
Then the time dependence is gone; those derivatives $dy/dt$ and $d^2y/dt^2$ are zero.
The external force $F$ is balanced by the internal force $Ky_\infty$. The system is in equilibrium:

Steady state equation
$$Ky_\infty = F = \text{constant}. \tag{4}$$


The central problem of computational mechanics is to create the stiffness matrix K and 
force vector $F$. Then the computer solves $My'' + Ky = 0$ and $Ky = F$. For large
problems, the finite element method is now the favorite way to take those steps.
This is a sensational achievement by the collective efforts of thousands of engineers.¹


Solution by Eigenvalues 


We want to solve $y'' + Sy = 0$. This is a linear system with constant coefficients. Our
solution method will be the same as for $y' = Ay$. We use the eigenvectors and eigenvalues
of $S$ to find special solutions, and we combine those to find the complete solution.

Each eigenvector of $S$ leads to two special solutions to $y'' + Sy = 0$:

Two solutions
$$\text{If } Sx = \lambda x \text{ then } y(t) = (\cos\omega t)\,x \text{ and } y(t) = (\sin\omega t)\,x. \tag{5}$$

The “frequency” $\omega$ is $\sqrt{\lambda}$. Substitute $y = (\cos\omega t)\,x$ into the differential equation:

$\lambda = \omega^2$ and $Sx = \omega^2x$
$$y'' + Sy = -\omega^2(\cos\omega t)\,x + S(\cos\omega t)\,x = 0. \tag{6}$$


When $\cos\omega t$ is factored out, we see the requirement on $x$. It must be an eigenvector of $S$.
We expect $n$ eigenvectors (normal modes of oscillation). The eigenvectors don't interact.
That is their beauty, each one goes its own way. And each eigenvector gives us two solutions
from $(\cos\omega t)\,x$ and $(\sin\omega t)\,x$, so we have $2n$ special solutions.

A combination of those $2n$ solutions will match the $2n$ initial conditions ($n$ positions and
$n$ velocities at $t = 0$). This determines the $2n$ constants $A_i$ and $B_i$ in the complete solution
to $y'' + Sy = 0$:

Complete solution
$$y(t) = \sum_{i=1}^{n}\left(A_i\cos\sqrt{\lambda_i}\,t + B_i\sin\sqrt{\lambda_i}\,t\right)x_i. \tag{7}$$


Since $\sin 0 = 0$, it is the $A_i$ that match the vector $y(0)$ of initial positions. It is the
$B_i$ that match the vector $v(0) = y'(0)$ of initial velocities.


Example 1 Two masses are connected by three identical springs in Figure 6.3.
Find the stiffness matrix $S$ and its positive eigenvalues $\lambda_1 = \omega_1^2$ and $\lambda_2 = \omega_2^2$. If the
system starts from rest, with the top spring unstretched ($y_1(0) = 0$) and the lower
mass moved down ($y_2(0) = 2$), find the positions $y = (y_1, y_2)$ at all later times:

$$m\frac{d^2y}{dt^2} + Sy = 0 \quad\text{with}\quad y(0) = \begin{bmatrix} 0 \\ 2 \end{bmatrix} \quad\text{and}\quad y'(0) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$

$y(t)$ has eigenvectors $x_1, x_2$ times cosine and sine. Four conditions for $A_1, A_2, B_1, B_2$.

Solution Construct the matrix $S$ that expresses Newton's Law $my'' + Sy = 0$. The
acceleration is $y''$, and the force is $-Sy$.


¹ The finite element method is a key part of my textbook on Computational Science and Engineering.
The foundations of the method and the reasons for its success are developed in An Analysis of the Finite
Element Method (also published by Wellesley-Cambridge Press).
What force $F$ is acting on the upper mass? The stretched top spring is pulling that mass
up. The force is proportional to the stretch $y_1$. This is Hooke's Law $F = -ky_1$.

The middle spring is connected to both masses. It is stretched a distance $y_2 - y_1$.
(No stretching if $y_2 = y_1$, the spring would just be shifted up or down.) The difference
$y_2 - y_1$ produces spring forces $k(y_2 - y_1)$, pulling mass 1 down and mass 2 up.

The bottom spring with fixed end is stretched by $0 - y_2$, so the force is $-ky_2$.

$F = ma$ at the upper mass $\quad -ky_1 + k(y_2 - y_1) = my_1''$

$F = ma$ at the lower mass $\quad -k(y_2 - y_1) - ky_2 = my_2''$

These equations $-Sy = my''$ or $my'' + Sy = 0$ have a symmetric matrix $S$. Take $k = m = 1$:

$$y'' + Sy = \frac{d^2}{dt^2}\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} + \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \tag{8}$$
The modeling part is complete, now for the solution part. The eigenvalues of that
matrix are $\lambda_1 = 1$ and $\lambda_2 = 3$. The trace is $1 + 3 = 4$, the determinant is $(1)(3) = 3$.
The first eigenvector $x_1 = (1, 1)$ has the springs moving in the same direction in
Figure 6.3. The second eigenvector $x_2 = (1, -1)$ has the springs moving oppositely,
with higher frequency because $\omega_2^2 = \lambda_2 = 3$.
Formula (7) for $y(t)$ becomes a combination of eigenvectors times cosines:

Solution
$$\begin{bmatrix} y_1(t) \\ y_2(t) \end{bmatrix} = A_1(\cos\sqrt{1}\,t)\begin{bmatrix} 1 \\ 1 \end{bmatrix} + A_2(\cos\sqrt{3}\,t)\begin{bmatrix} 1 \\ -1 \end{bmatrix}. \tag{9}$$

I removed $B_1\sin t$ and $B_2\sin\sqrt{3}\,t$ because the example started from rest (zero velocity).
At time $t = 0$, cosines give position $y(0)$ and sines give velocity $v(0)$.


Figure 6.3: The masses oscillate up and down. $y(t)$ combines $(\cos t)\,x_1$ and $(\cos\sqrt{3}\,t)\,x_2$,
with $x_1 = (1, 1)$ and $x_2 = (1, -1)$.
The final step is to find $A_1$ and $A_2$ from the initial position $y(0) = (0, 2)$:

Initial condition
$$A_1\begin{bmatrix} 1 \\ 1 \end{bmatrix} + A_2\begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 2 \end{bmatrix} \quad\text{gives}\quad A_1 = 1 \text{ and } A_2 = -1.$$

Final answer: $y_1(t) = \cos t - \cos\sqrt{3}\,t$ and $y_2(t) = \cos t + \cos\sqrt{3}\,t$. The two masses
oscillate forever. The solution part was easier than the modeling part. This is very typical. 
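
The final answer of Example 1 can be checked numerically. The Python sketch below is only a verification aid under the example's own assumptions ($k = m = 1$, start from rest); the function names `y`, `y_prime`, `y_second` and `residual` are illustrative. The derivatives are written out by hand from the cosine formulas, and `residual` evaluates $y'' + Sy$, which should vanish at every time.

```python
import math

S = [[2.0, -1.0],
     [-1.0, 2.0]]
ROOT3 = math.sqrt(3.0)


def y(t):
    """Positions from the final answer of Example 1."""
    return (math.cos(t) - math.cos(ROOT3 * t),
            math.cos(t) + math.cos(ROOT3 * t))


def y_prime(t):
    """Velocities: first derivative of y(t)."""
    return (-math.sin(t) + ROOT3 * math.sin(ROOT3 * t),
            -math.sin(t) - ROOT3 * math.sin(ROOT3 * t))


def y_second(t):
    """Accelerations: second derivative of y(t)."""
    return (-math.cos(t) + 3.0 * math.cos(ROOT3 * t),
            -math.cos(t) - 3.0 * math.cos(ROOT3 * t))


def residual(t):
    """Components of y'' + S y, which should be zero for a solution."""
    y1, y2 = y(t)
    a1, a2 = y_second(t)
    return (a1 + S[0][0] * y1 + S[0][1] * y2,
            a2 + S[1][0] * y1 + S[1][1] * y2)


print(y(0.0))        # initial position
print(y_prime(0.0))  # initial velocity
for t in (0.5, 1.3, 7.0):
    print(t, residual(t))
```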


Symmetric Matrices 
Example 1 led to a symmetric matrix $S$. Many many examples lead to symmetric matrices.
Perhaps this is an extension of Newton's third law, that every action produces an equal and
opposite reaction. We really must focus on the special properties of symmetric matrices,
because those properties are so useful and the matrices appear so often.

Eigenvalues and eigenvectors: this is the information we need from the matrix.
For every class of matrices, we ask about $\lambda$ and $x$. Are the eigenvalues real? Are they
positive, so we can take square roots in $\lambda = \omega^2$? Are there $n$ independent eigenvectors?
Are the $x$'s orthogonal? The example with $\lambda_1 = 1$ and $\lambda_2 = 3$ was perfect in all respects:

$S = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$ is symmetric positive definite
Positive real $\lambda = 1$ and $3$
Orthogonal $x = (1, 1), (1, -1)$


Real eigenvalues All the eigenvalues of a real symmetric matrix are real. 


Proof Suppose that $Sx = \lambda x$. Until we know otherwise, $\lambda$ might be a complex number
and $x$ might be a complex vector. If that did happen, the rules for complex conjugates would
give $S\bar{x} = \bar{\lambda}\bar{x}$. The key idea is to look at $\bar{x}^TSx$:

$S$ is symmetric and real
$$\bar{x}^TSx = \bar{x}^TS^Tx = (S\bar{x})^Tx. \tag{10}$$

The left side is $\bar{x}^T\lambda x$. The right side is $\bar{x}^T\bar{\lambda}x$. One side has $\lambda$, the other side has $\bar{\lambda}$.
They multiply $\bar{x}^Tx$ which is not zero: it is the squared length $|x_1|^2 + \cdots + |x_n|^2$.
Therefore $\lambda = \bar{\lambda}$.
When $\lambda = a + ib$ equals $\bar{\lambda} = a - ib$, we know that $b = 0$ and $\lambda$ is real.

Then the vector $x$ in the nullspace of the real matrix $S - \lambda I$ can also be kept real.


Orthogonal eigenvectors If $Sx = \lambda_1x$ and $Sy = \lambda_2y$ and $\lambda_1 \neq \lambda_2$, then $x^Ty = 0$.

Proof Take the dot product of the first equation with $y$ and the second equation with $x$:

Use $S^T = S$
$$(Sx)^Ty = x^TSy \quad\text{is}\quad \lambda_1x^Ty = \lambda_2x^Ty. \tag{11}$$

Since $\lambda_1 \neq \lambda_2$, this proves that $x^Ty = 0$. The eigenvectors are perpendicular.

Remember: The main goal of eigenvectors is to diagonalize a matrix, $A = V\Lambda V^{-1}$.
Here the matrix is $S$ and its eigenvectors are orthogonal. We can certainly make them unit
vectors, so $x^Tx = 1$ and $x^Ty = 0$. The matrix $V$ with the eigenvectors in its columns
has become an orthogonal matrix: $V^TV = I$. The right letter for this orthogonal matrix
$V$ is $Q$. The eigenvector matrix $V$ in $V\Lambda V^{-1}$ can be orthogonal: $Q^TQ = I$.

Spectral theorem / Principal axis theorem
$$S = Q\Lambda Q^{-1} = Q\Lambda Q^T. \tag{12}$$


In algebra, the eigenvectors are orthogonal. In geometry, the principal axes of an ellipse 
are orthogonal. If the ellipse equation is $2x^2 - 2xy + 2y^2 = 1$, this corresponds to the
example matrix $S$. Its principal axes $(1, 1)$ and $(1, -1)$ (eigenvectors) are at $+45°$ and
$-45°$ from the $x$ axis. The ellipse is turned by $+45°$ from horizontal and vertical axes.

With repeated eigenvalues, $S = Q\Lambda Q^T$ is still correct. Every symmetric $S$ has a full
set of $n$ independent eigenvectors (Chapter 6 Notes) even if eigenvalues are repeated.

To summarize, $Q\Lambda Q^T$ is a perfect description of symmetric matrices $S$. Every $S$ has
those factors and every matrix of this form is sure to be symmetric: $(Q\Lambda Q^T)^T$ equals
$Q^{TT}\Lambda^TQ^T$ which is $Q\Lambda Q^T$. If we multiply columns of $Q$ times rows of $\Lambda Q^T$, we see
$S$ in a new way (a sum of rank one matrices):

Matrices with rank 1 add to $S$
$$S = \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix}\begin{bmatrix} \lambda_1x_1^T \\ \vdots \\ \lambda_nx_n^T \end{bmatrix} = \lambda_1x_1x_1^T + \cdots + \lambda_nx_nx_n^T. \tag{13}$$

This is the great factorization $S = Q\Lambda Q^T$, in terms of eigenvalues and eigenvectors.


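Example 2 The eigenvectors $(1, 1)$ and $(-1, 1)$ with $\lambda = 4$ and $16$ give unit eigenvectors
$x_1 = (1, 1)/\sqrt{2}$ and $x_2 = (-1, 1)/\sqrt{2}$:

$$S = \begin{bmatrix} 10 & -6 \\ -6 & 10 \end{bmatrix} = Q\Lambda Q^T = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 4 & 0 \\ 0 & 16 \end{bmatrix}\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$$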

Those eigenvectors still point in the 45° direction and the 135° direction (90° apart). They 
are the same as in Example 1, because this new S is 6 times the original S, minus 2/. 
Then the new eigenvalues 16 and 4 of S must be 6 times the original 3 and 1, minus 2. 
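
The rank one form (13) can be tested on this $S$. In the Python sketch below, direct multiplication gives $S(1, 1) = 4(1, 1)$ and $S(-1, 1) = 16(-1, 1)$, so the unit eigenvectors are paired with 4 and 16 in that order. The helper names `matvec` and `outer` are illustrative, and plain nested lists stand in for matrices.

```python
import math

S = [[10.0, -6.0],
     [-6.0, 10.0]]

r = 1.0 / math.sqrt(2.0)
# (eigenvalue, unit eigenvector) pairs for this S
pairs = [(4.0, (r, r)),
         (16.0, (-r, r))]


def matvec(M, x):
    """Matrix times vector for a 2 by 2 matrix."""
    return [M[i][0] * x[0] + M[i][1] * x[1] for i in range(2)]


def outer(x):
    """Rank one matrix x x^T."""
    return [[x[i] * x[j] for j in range(2)] for i in range(2)]


# Add up lambda_1 x1 x1^T + lambda_2 x2 x2^T.
rebuilt = [[0.0, 0.0], [0.0, 0.0]]
for lam, x in pairs:
    P = outer(x)
    for i in range(2):
        for j in range(2):
            rebuilt[i][j] += lam * P[i][j]

for lam, x in pairs:
    print(lam, matvec(S, x), [lam * xi for xi in x])   # S x versus lambda x
print(rebuilt)                                          # should reproduce S
```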

The eigenvectors in $Q$ are the principal axes of an ellipse $10x^2 - 12xy + 10y^2 = 1$.

If I change $-6$ and $-6$ off the diagonal to $6i$ and $-6i$, the determinant is still 64.
The trace is still 20 and the eigenvalues are still 16 and 4 (real!). For complex matrices, 
we want a symmetric real part and an antisymmetric imaginary part. Let me explain why. 


Complex Matrices 


Important: The squared length is $\bar{x}^Tx$ and not $x^Tx$ when $x$ has complex components.
We want $|x_1|^2 + \cdots + |x_n|^2$ because this is a positive number or zero. We don't want
$x_1^2 + \cdots + x_n^2$ because that could be any complex number, and we are looking for
$\|x\|^2$ = length squared $\geq 0$. When a component of $x$ is $a + bi$, we want $a^2 + b^2$ and
not $(a + bi)^2$. The length squared of $x = (1, i)$ is $\|x\|^2 = 1^2 + 1^2 = 2$ and not $1^2 + i^2 = 0$.
This changes all inner products (dot products) from $x^Ty$ to $\bar{x}^Ty$. Complex vectors
$x$ and $y$ are perpendicular when $\bar{x}^Ty = 0$. This complex inner product forces us to
replace the usual transpose by the conjugate transpose $(\bar{A})^T = A^*$, when $A$ is complex:

$$A^*_{ij} \text{ is } \overline{A_{ji}} \qquad\text{Then}\quad Ax\cdot y = (\overline{Ax})^Ty = \bar{x}^T\bar{A}^Ty = x\cdot A^*y. \tag{14}$$

MATLAB automatically takes the conjugate transpose to give $A^*$, when you type x' or A'.
To keep the row space of $A$ perpendicular to the nullspace, we must use $C(A^*)$ for
the row space. This is the column space of $A^*$, not just the column space of $A^T$. Replace
every $i$ by $-i$. And an important name: the complex version of a symmetric matrix
$A^T = A$ is a “Hermitian matrix” $A^* = A$.

Hermitian matrix $A_{ij} = \overline{A_{ji}}$ Then $Ax\cdot y = x\cdot A^*y$ becomes $Ax\cdot y = x\cdot Ay$.


Example 3 This 2 by 2 complex matrix is Hermitian (notice $i$ and $-i$):

$$A = \begin{bmatrix} 3 & i \\ -i & 3 \end{bmatrix} = A^*$$

The determinant is 8 (real). The trace is 6 (the main diagonal of a Hermitian matrix is real).
The eigenvalues of this matrix are 2 and 4 (both real!).


Hermitian matrices A = A* have real eigenvalues and perpendicular eigenvectors. 


The eigenvectors of $A$ are $x_1 = (1, i)$ and $x_2 = (1, -i)$. They are perpendicular:
$x_1^*x_2 = 1^2 + (-i)^2 = 0$. Divide by $\sqrt{2}$ to make them unit vectors. Then they are the
columns of a complex orthogonal matrix $Q$. The right meaning of “complex orthogonal”
is $Q^* = Q^{-1}$, and the right name when $Q$ is complex is unitary:

Unitary matrix $Q^*Q = I$ The columns of $Q$ are perpendicular unit vectors.

The great factorization $A = Q\Lambda Q^T$ of real symmetric matrices becomes $A = Q\Lambda Q^*$.
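
Python's built-in complex numbers make it easy to test these statements. The sketch below assumes the Example 3 matrix has rows $(3, i)$ and $(-i, 3)$, which is consistent with the stated trace 6, determinant 8, and eigenvectors $(1, i)$ and $(1, -i)$. The helper names `conj_transpose`, `matmul` and `inner` are illustrative.

```python
import math

# Assumed entries for the Hermitian matrix of Example 3.
A = [[3 + 0j, 1j],
     [-1j, 3 + 0j]]


def conj_transpose(M):
    """Conjugate transpose M* of a square matrix stored as nested lists."""
    n = len(M)
    return [[M[j][i].conjugate() for j in range(n)] for i in range(n)]


def matmul(X, Y):
    """Product of two square matrices of the same size."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inner(x, y):
    """Complex inner product: conjugate the first vector."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))


r = 1.0 / math.sqrt(2.0)
x1 = [r + 0j, r * 1j]      # unit eigenvector for lambda = 2
x2 = [r + 0j, -r * 1j]     # unit eigenvector for lambda = 4

Q = [[x1[0], x2[0]],
     [x1[1], x2[1]]]       # eigenvectors in the columns
Lam = [[2 + 0j, 0j],
       [0j, 4 + 0j]]

rebuilt = matmul(matmul(Q, Lam), conj_transpose(Q))   # Q Lambda Q*
naive_length = 1 ** 2 + (1j) ** 2                     # x^T x for x = (1, i)

print(conj_transpose(A) == A)    # Hermitian check
print(abs(inner(x1, x2)))        # perpendicular: close to 0
print(rebuilt)                   # should reproduce A
print(naive_length)              # the wrong "length squared"
```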


Orthogonal Matrices and Unitary Matrices 


We have seen the big theorem: If $S$ is symmetric or Hermitian, its eigenvector matrix
is orthogonal or unitary. The real case is $S = Q\Lambda Q^T = S^T$ and the complex case is
$S = Q\Lambda Q^* = S^*$. The eigenvalues in $\Lambda$ are real.

What if our matrix is anti-symmetric or anti-Hermitian? Then $A^T = -A$ or $A^* = -A$.
The matrix $A$ could even be $i$ times $S$. (In that case $A^*$ will be $-i$ times $S^*$ which is
exactly $-iS = -A$.) Multiplying by $i$ changes Hermitian to anti-Hermitian. The real
eigenvalues $\lambda$ of $S$ change to the imaginary eigenvalues $i\lambda$ of $A$. The eigenvectors do
not change: still orthogonal, still going into $Q$.


Anti-Hermitian matrices have imaginary eigenvalues and orthogonal eigenvectors. 


Our standard examples are $A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ and $A = \begin{bmatrix} 0 & i \\ i & 0 \end{bmatrix}$. Both have $A^* = -A$ and $\lambda = \pm i$.
Finally, what if our matrix is orthogonal or unitary? Then $Q^TQ = I$ or $Q^*Q = I$.
The eigenvalues of $Q$ are complex numbers $\lambda = e^{i\theta}$ on the unit circle.

If $Q^*Q = I$ then all eigenvalues of $Q$ have magnitude $|\lambda| = 1$.

The proof starts with $Qx = \lambda x$. The conjugate transpose is $x^*Q^* = \bar{\lambda}x^*$. Multiply the
left hand sides using $Q^*Q = I$, and multiply the right hand sides using $\bar{\lambda}\lambda = |\lambda|^2$:

$x^*Q^*Qx = x^*\bar{\lambda}\lambda x$ is the same as $x^*x = |\lambda|^2x^*x$. Then $|\lambda|^2 = 1$ and $|\lambda| = 1$.


The eigenvectors of Q, like the eigenvectors of S and A, can be chosen orthogonal. 
These are the essential facts about the best matrices. The eigenvalues of S and A and Q are 
on the real axis, the imaginary axis, and the unit circle in the complex plane. 
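
A plane rotation gives a quick check of the unit circle statement. This Python sketch assumes the rotation matrix with rows $(\cos\theta, -\sin\theta)$ and $(\sin\theta, \cos\theta)$ and an arbitrary angle; the name `rotate` is illustrative. It confirms that $(1, -i)$ and $(1, i)$ are eigenvectors with eigenvalues $e^{i\theta}$ and $e^{-i\theta}$, both of magnitude 1.

```python
import cmath
import math

theta = 0.7                                  # arbitrary angle in radians
c, s = math.cos(theta), math.sin(theta)
Q = [[c, -s],
     [s, c]]                                 # real orthogonal matrix


def rotate(Q, x):
    """Apply the 2 by 2 matrix Q to the (possibly complex) vector x."""
    return [Q[0][0] * x[0] + Q[0][1] * x[1],
            Q[1][0] * x[0] + Q[1][1] * x[1]]


# (eigenvalue, eigenvector) pairs to be verified
pairs = [(cmath.exp(1j * theta), [1, -1j]),
         (cmath.exp(-1j * theta), [1, 1j])]

for lam, x in pairs:
    Qx = rotate(Q, x)
    error = max(abs(Qx[k] - lam * x[k]) for k in range(2))
    print(abs(lam), error)                   # magnitude 1, error near 0
```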

In the eigenvalue-eigenvector world, a triangular matrix is not really one of the best. 
Its eigenvalues are easy (on the main diagonal). But its eigenvectors are not orthogonal. 
It may even fail to be diagonalizable. Matrices without n eigenvectors are the worst. 


Symmetric and Orthogonal 


At the end of Chapter 4, we looked at symmetric matrices that are also orthogonal: $A^T = A$
and $A^T = A^{-1}$. Every diagonal matrix $D$ of 1's and $-1$'s has both properties. Then
every $A = QDQ^T$ also has both properties. Symmetry is clear, and a product of
orthogonal matrices $Q$ and $D$ and $Q^T$ is sure to stay orthogonal.

The question we could not answer was: Does $QDQ^T$ give all possible examples?
The answer is yes, and now we can see why $A$ has this form, based on eigenvalues.

When $A$ is symmetric, its eigenvalues are real. When $A$ is orthogonal, its eigenvalues
have $|\lambda| = 1$. The only possibilities for both are $\lambda = 1$ and $\lambda = -1$. The eigenvalue
matrix $\Lambda = D$ is a diagonal matrix of 1's and $-1$'s. Then the great fact about symmetric
matrices (the Spectral Theorem) guarantees that $A$ has the form $Q\Lambda Q^T$ which is $QDQ^T$.


■ REVIEW OF THE KEY IDEAS ■

1. A real symmetric matrix $S$ has real eigenvalues and perpendicular eigenvectors.

2. Diagonalization $S = V\Lambda V^{-1}$ becomes $S = Q\Lambda Q^T$ with an orthogonal matrix $Q$.

3. A complex matrix is Hermitian if $\bar{S}^T = S$ (often written $S^* = S$): real $\lambda$'s.

4. Every Hermitian matrix is $S = Q\Lambda\bar{Q}^T = Q\Lambda Q^*$. Dot products are $x\cdot y = x^*y$.

5. All three matrices $S$ and $A = iS = -A^*$ and $Q$ have orthogonal eigenvectors.

6. Symmetric matrices in $y'' + Sy = 0$ and $My'' + Ky = 0$ give oscillation.
Problem Set 6.5
Problems 1-14 are about eigenvalues. Then come differential equations. 


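1 Which of $A, B, C$ have two real $\lambda$'s? Which have two independent eigenvectors?

$A = \begin{bmatrix} 7 & -11 \\ -11 & 7 \end{bmatrix} \qquad B = \begin{bmatrix} 7 & -11 \\ 11 & 7 \end{bmatrix} \qquad C = \begin{bmatrix} 7 & -11 \\ 0 & 7 \end{bmatrix}$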
2 Show that $A$ has real eigenvalues if $b > 0$ and nonreal eigenvalues if $b < 0$:

$A = \begin{bmatrix} 0 & b \\ 1 & 0 \end{bmatrix}$ and $A = \begin{bmatrix} 1 & b \\ 1 & 1 \end{bmatrix}$
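3 Find the eigenvalues and the unit eigenvectors of the symmetric matrices

(a) $S = \begin{bmatrix} 2 & 2 & 2 \\ 2 & 0 & 0 \\ 2 & 0 & 0 \end{bmatrix}$ and (b) $S = \begin{bmatrix} 1 & 0 & 2 \\ 0 & -1 & -2 \\ 2 & -2 & 0 \end{bmatrix}$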


4 Find an orthogonal matrix $Q$ that diagonalizes $S = \begin{bmatrix} -2 & 6 \\ 6 & 7 \end{bmatrix}$. What is $\Lambda$?


5 Show that this $A$ (symmetric but complex) has only one line of eigenvectors:

$A = \begin{bmatrix} i & 1 \\ 1 & -i \end{bmatrix}$ is not even diagonalizable. Its eigenvalues are 0 and 0.

$A^T = A$ is not so special for complex matrices. The good property is $\bar{A}^T = A$.


6 Find all orthogonal matrices from all $x_1, x_2$ to diagonalize $S = \begin{bmatrix} 9 & 12 \\ 12 & 16 \end{bmatrix}$.


7 (a) Find a symmetric matrix $S = \begin{bmatrix} 1 & b \\ b & 1 \end{bmatrix}$ that has a negative eigenvalue.


(b) How do you know that S' must have a negative pivot? 


(c) How do you know that S can’t have two negative eigenvalues? 


8 If $A^3 = 0$ then the eigenvalues of $A$ must be ____. Give an example with $A \neq 0$.
But if A is symmetric, diagonalize it to prove that the matrix is A = 0. 


9 If $\lambda = a + ib$ is an eigenvalue of a real matrix $A$, then its conjugate $\bar{\lambda} = a - ib$ is also
an eigenvalue. (If $Ax = \lambda x$ then also $A\bar{x} = \bar{\lambda}\bar{x}$.) Prove that every real 3 by 3 matrix
has at least one real eigenvalue.
10 Here is a quick “proof” that the eigenvalues of all real matrices are real:

False proof $Ax = \lambda x$ gives $x^TAx = \lambda x^Tx$ so $\lambda = \dfrac{x^TAx}{x^Tx}$ is real.

Find the flaw in this reasoning, a hidden assumption that is not justified. You could
test those steps on the 90° rotation matrix $\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$ with $\lambda = i$ and $x = (i, 1)$.

11 Write $A$ and $B$ in the form $\lambda_1x_1x_1^T + \lambda_2x_2x_2^T$ of the spectral theorem $Q\Lambda Q^T$:

$A = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} \qquad B = \begin{bmatrix} 9 & 12 \\ 12 & 16 \end{bmatrix}$

12 What number $b$ in $\begin{bmatrix} 2 & b \\ 1 & 0 \end{bmatrix}$ makes $A = Q\Lambda Q^T$ possible? What number makes $A = V\Lambda V^{-1}$
impossible? What number makes $A^{-1}$ impossible?

13 This $A$ is nearly symmetric. But its eigenvectors are far from orthogonal:

$A = \begin{bmatrix} 1 & 10^{-15} \\ 0 & 1 + 10^{-15} \end{bmatrix}$ has eigenvectors $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $[\ ?\ ]$

What is the dot product of the two unit eigenvectors? A small angle!

14 (Recommended) This matrix $M$ is skew-symmetric and also orthogonal. Then all its
eigenvalues are pure imaginary and they also have $|\lambda| = 1$. They can only be $i$ or $-i$.
Find all four eigenvalues from the trace of $M$:

$M = \dfrac{1}{\sqrt{3}}\begin{bmatrix} 0 & 1 & 1 & 1 \\ -1 & 0 & -1 & 1 \\ -1 & 1 & 0 & -1 \\ -1 & -1 & 1 & 0 \end{bmatrix}$ can only have eigenvalues $i$ or $-i$.

15 The complete solution to equation (8) for two oscillating springs (Figure 6.3) is

$y(t) = (A_1\cos t + B_1\sin t)\begin{bmatrix} 1 \\ 1 \end{bmatrix} + (A_2\cos\sqrt{3}\,t + B_2\sin\sqrt{3}\,t)\begin{bmatrix} 1 \\ -1 \end{bmatrix}.$

Find the numbers $A_1, A_2, B_1, B_2$ if $y(0) = (3, 5)$ and $y'(0) = (2, 0)$.

16 If the springs in Figure 6.3 have different constants $k_1, k_2, k_3$ then $y'' + Sy = 0$ is

Upper mass $\quad y_1'' + k_1y_1 - k_2(y_2 - y_1) = 0$
Lower mass $\quad y_2'' + k_2(y_2 - y_1) + k_3y_2 = 0$
$S = \begin{bmatrix} k_1 + k_2 & -k_2 \\ -k_2 & k_2 + k_3 \end{bmatrix}$

For $k_1 = 1, k_2 = 4, k_3 = 1$ find the eigenvalues $\lambda = \omega^2$ of $S$ and the complete
sine/cosine solution $y(t)$ in equation (7).
17 Suppose the third spring is removed ($k_3 = 0$ and nothing is below mass 2). With
$k_1 = 3, k_2 = 2$ in Problem 16, find $S$ and its real eigenvalues and orthogonal
eigenvectors. What is the sine/cosine solution $y(t)$ if $y(0) = (1, 2)$ gives the cosines
and $y'(0) = (2, -1)$ gives the sines?

18 Suppose the top spring is also removed ($k_1 = 0$ and also $k_3 = 0$). $S$ is singular!
Find its eigenvalues and eigenvectors. If $y(0) = (1, -1)$ and $y'(0) = (0, 0)$ find $y(t)$.
If $y(0)$ changes from $(1, -1)$ to $(1, 1)$ what is $y(t)$?

19 The matrix in this question is skew-symmetric ($A^T = -A$). Energy is conserved.

$\dfrac{dy}{dt} = \begin{bmatrix} 0 & c & -b \\ -c & 0 & a \\ b & -a & 0 \end{bmatrix}y$ or $\begin{aligned} y_1' &= cy_2 - by_3 \\ y_2' &= ay_3 - cy_1 \\ y_3' &= by_1 - ay_2. \end{aligned}$

The derivative of $\|y(t)\|^2 = y_1^2 + y_2^2 + y_3^2$ is $2y_1y_1' + 2y_2y_2' + 2y_3y_3'$.
Substitute $y_1', y_2', y_3'$ to get zero. The energy $\|y(t)\|^2$ stays equal to $\|y(0)\|^2$.

20 When $A = -A^T$ is skew-symmetric, $e^{At}$ is orthogonal. Prove $(e^{At})^T = e^{-At}$
from the series $e^{At} = I + At + \frac{1}{2}A^2t^2 + \cdots$.

21 The mass matrix $M$ can have masses $m_1 = 1$ and $m_2 = 2$. Show that the eigenvalues
for $Kx = \lambda Mx$ are $\lambda = 2 \pm \sqrt{2}$, starting from $\det(K - \lambda M) = 0$:

$M = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}$ and $K = \begin{bmatrix} 2 & -2 \\ -2 & 4 \end{bmatrix}$ are positive definite.

Find the two eigenvectors $x_1$ and $x_2$. Show that $x_1^Tx_2 \neq 0$ but $x_1^TMx_2 = 0$.

22 What difference equation would you use to solve $y'' = -Sy$?

23 The second order equation $y'' + Sy = 0$ reduces to a first order system $y_1' = y_2$
and $y_2' = -Sy_1$. If $Sx = \omega^2x$ show that the companion matrix $A = \begin{bmatrix} 0 & I \\ -S & 0 \end{bmatrix}$
has eigenvalues $i\omega$ and $-i\omega$ with eigenvectors $(x, i\omega x)$ and $(x, -i\omega x)$.

24 Find the eigenvalues $\lambda$ and eigenfunctions $y(x)$ for the differential equation
$y'' = \lambda y$ with $y(0) = y(\pi) = 0$. There are infinitely many!
Table of Eigenvalues and Eigenvectors

How are the properties of a matrix reflected in its eigenvalues and eigenvectors?
This question is fundamental throughout Chapter 6. A table that organizes the key facts may
be helpful. Here are the special properties of the eigenvalues $\lambda_i$ and the eigenvectors $x_i$.

Matrix                                              Eigenvalues                              Eigenvectors
Symmetric: $S^T = S$                                real $\lambda$'s                         orthogonal $x_i^Tx_j = 0$
Orthogonal: $Q^T = Q^{-1}$                          all $|\lambda| = 1$                      orthogonal $\bar{x}_i^Tx_j = 0$
Skew-symmetric: $A^T = -A$                          imaginary $\lambda$'s                    orthogonal $\bar{x}_i^Tx_j = 0$
Complex Hermitian: $\bar{S}^T = S$                  real $\lambda$'s                         orthogonal $\bar{x}_i^Tx_j = 0$
Positive Definite: $x^TSx > 0$                      all $\lambda > 0$                        orthogonal since $S^T = S$
Markov: $m_{ij} > 0$, $\sum_{i=1}^n m_{ij} = 1$     $\lambda_{\max} = 1$                     steady state $x > 0$
Similar: $B = V^{-1}AV$                             $\lambda(B) = \lambda(A)$                $x(B) = V^{-1}x(A)$
Projection: $P = P^2 = P^T$                         $\lambda = 1;\ 0$                        column space; nullspace
Plane Rotation: $\cos\theta$, $\sin\theta$          $e^{i\theta}$ and $e^{-i\theta}$         $x = (1, i)$ and $(1, -i)$
Reflection: $I - 2uu^T$                             $\lambda = -1;\ 1, \ldots, 1$            $u$; whole plane $u^\perp$
Rank One: $uv^T$                                    $\lambda = v^Tu;\ 0, \ldots, 0$          $u$; whole plane $v^\perp$
Inverse: $A^{-1}$                                   $1/\lambda(A)$                           keep eigenvectors of $A$
Shift: $A + cI$                                     $\lambda(A) + c$                         keep eigenvectors of $A$
Function: any $f(A)$                                $f(\lambda_1), \ldots, f(\lambda_n)$     keep eigenvectors of $A$
Stable Powers: $A^n \to 0$                          all $|\lambda| < 1$                      any eigenvectors
Stable Exponential: $e^{At} \to 0$                  all $\mathrm{Re}\,\lambda < 0$           any eigenvectors
Tridiagonal: diagonals $-1, 2, -1$                  $\lambda_k = 2 - 2\cos\frac{k\pi}{n+1}$  $x_k = \left(\sin\frac{k\pi}{n+1}, \sin\frac{2k\pi}{n+1}, \ldots\right)$

Factorizations Based on Eigenvalues (Singular Values in $\Sigma$)

Diagonalizable: $A = V\Lambda V^{-1}$               diagonal of $\Lambda$ has $\lambda_i$    eigenvectors in $V$
Symmetric: $S = Q\Lambda Q^T$                       diagonal of $\Lambda$ (real $\lambda_i$) orthonormal eigenvectors in $Q$
Jordan form: $J = V^{-1}AV$                         diagonal of $J$ is $\Lambda$             each block gives $x = (0, \ldots, 1, \ldots, 0)$
SVD for any $A$: $A = U\Sigma V^T$                  rank($A$) = rank($\Sigma$)               eigenvectors of $A^TA$, $AA^T$ in $V$, $U$
■ CHAPTER 6 NOTES ■

A symmetric matrix $S$ has perpendicular eigenvectors. Suppose $Sx = \lambda_1x$ and
$Sy = \lambda_2y$ and $\lambda_1 \neq \lambda_2$. Subtract $\lambda_1I$ from both equations:

$(S - \lambda_1I)x = 0$ and $(S - \lambda_1I)y = (\lambda_2 - \lambda_1)y$.

This puts $x$ in the nullspace and $y$ in the column space of $S - \lambda_1I$. That matrix is real
symmetric, so its column space is also its row space. Then $x$ in the nullspace is sure to be
perpendicular to $y$ in the row space. A new proof that $x^Ty = 0$.

Several proofs that S has a full set of n independent (and orthogonal) eigenvectors— 
even in the case of repeated eigenvalues—are on the course website for linear algebra: 
web.mit.edu/18.06 (Proofs of the Spectral Theorem). 


Similar Matrices and the Jordan Form 


For every $A$, we want to choose $V$ so that $V^{-1}AV$ is as nearly diagonal as possible. When
$A$ has a full set of $n$ eigenvectors, they go into the columns of $V$. Then the matrix $V^{-1}AV$
is diagonal, period. This matrix $\Lambda$ is the Jordan form of $A$, when $A$ can be diagonalized.
But if eigenvectors are missing, $\Lambda$ can't be reached.

Suppose $A$ has $s$ independent eigenvectors. Then it is similar to a matrix with $s$ blocks.
Each block has the eigenvalue $\lambda$ on the diagonal with 1's just above it. This block accounts
for one eigenvector. When there are $n$ eigenvectors and $n$ blocks, $J$ is $\Lambda$.

The Jordan form $J$ has an off-diagonal 1 for each missing eigenvector (and the 1's are next
to the eigenvalues). This is the big theorem about matrix similarity. In every family of
similar matrices, we are picking one outstanding member called $J$. It is nearly diagonal
(or if possible completely diagonal). We can solve $dz/dt = Jz$ by back substitution.
Then we have solved $dy/dt = Ay$ with $y = Vz$.

Jordan's Theorem is proved in my textbook Linear Algebra and Its Applications.
The reasoning is rather intricate and the Jordan form is not at all popular in computations.
A slight change in $A$ will separate the repeated eigenvalues and bring a diagonal $\Lambda$.


Time-varying systems $y' = A(t)y$: Wrong formula and correct formula for $y(t)$
Section 6.4 recognized that linear systems are more difficult when the matrix depends on $t$.
The formula $y(t) = \exp\left(\int A(t)\,dt\right)y(0)$ is not correct. The underlying reason is that $e^{A+B}$
(the wrong matrix) is generally different from $e^Ae^B$ (the correct matrix at $t = 2$, when the
system jumps from $y' = By$ to $y' = Ay$ at $t = 1$). Go forward in time: $e^B$ and then $e^A$.

It is not usual for a basic textbook to attempt a correct formula. But this is a chance to
emphasize that Euler's difference equation goes forward in the right order. It steps from $Y_n$
at time $n\Delta t$ to $Y_{n+1}$ at time $(n + 1)\Delta t$, using the current matrix $A$ at time $n\Delta t$.

Euler's method $\Delta Y/\Delta t = AY$ or $Y_{n+1} = E_nY_n$ with $E_n = I + \Delta t\,A(n\Delta t)$.
When we reach $Y_N$, we have multiplied $Y_0$ by $N$ matrices $E_0$ to $E_{N-1}$ in the right order:
$Y_N = E_{N-1}E_{N-2}\cdots E_1E_0Y_0$.


Basic theory says that Euler's $Y_N$ approaches the correct $y(t)$, when $\Delta t = t/N$ and
$N \to \infty$. That product of $E$'s approaches the correct replacement for $e^{At}$. When $A$ is a
constant matrix, not changing with time, all $E$'s are the same and we reach $e^{At}$ from $E^N$:

Constant matrix $A$ $\quad e^{At} = \text{limit of } (I + \Delta t\,A)^N = \text{limit of } \left(I + \dfrac{At}{N}\right)^N.$
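
The limit $(I + At/N)^N \to e^{At}$ can be watched numerically. The Python sketch below uses a sample matrix chosen for illustration, the 2 by 2 skew-symmetric $A$ with rows $(0, 1)$ and $(-1, 0)$, whose exponential is known in closed form: $e^{At}$ has rows $(\cos t, \sin t)$ and $(-\sin t, \cos t)$ because $A^2 = -I$. The names `matmul2`, `euler_power` and `max_error` are illustrative.

```python
import math


def matmul2(X, Y):
    """Product of two 2 by 2 matrices stored as nested lists."""
    return [[X[i][0] * Y[0][j] + X[i][1] * Y[1][j] for j in range(2)]
            for i in range(2)]


def euler_power(A, t, N):
    """Compute (I + A t / N)^N by N successive multiplications."""
    h = t / N
    E = [[1.0 + h * A[0][0], h * A[0][1]],
         [h * A[1][0], 1.0 + h * A[1][1]]]
    P = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(N):
        P = matmul2(E, P)
    return P


A = [[0.0, 1.0],
     [-1.0, 0.0]]          # sample constant matrix with A^2 = -I
t = 1.0
exact = [[math.cos(t), math.sin(t)],
         [-math.sin(t), math.cos(t)]]


def max_error(N):
    """Largest entry of (I + At/N)^N - exp(At)."""
    P = euler_power(A, t, N)
    return max(abs(P[i][j] - exact[i][j]) for i in range(2) for j in range(2))


for N in (10, 100, 10000):
    print(N, max_error(N))   # error shrinks roughly like 1/N
```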


This came from compound interest in Section 1.3, when $A$ was a number (1 by 1 matrix).
The limit of $E_{N-1}E_{N-2}\cdots E_1E_0$ is called a product integral. An ordinary
“sum integral” $\int A(t)\,dt$ is the limit of a sum of $N$ terms $\Delta t\,A$ (each term going to zero).
Now we are multiplying $N$ terms $I + \Delta t\,A$ (each term going to $I$). Term by term,
$I + \Delta t\,A$ is close to $e^{\Delta t\,A}$. But matrices don't always commute, and $\exp\int A(t)\,dt$ is
wrong. Matrix products $E_{N-1}\cdots E_1E_0$ approach a product integral and the correct $y(t)$.

Product integral $M(t) = \text{limit of } E_{N-1}E_{N-2}\cdots E_1E_0$. Then $y(t) = M(t)y(0)$.
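
Here is a hedged numerical comparison of the product of Euler matrices with the wrong exponential formula. It uses the matrix $A(t)$ with rows $(1, 2t)$ and $(0, 0)$ from Problems 24 and 25 of Section 6.4. Solving as Problem 25 suggests gives $y_2(t) = y_2(0)$ and $y_1(t) = (y_1(0) + 2y_2(0))e^t - 2(t + 1)y_2(0)$. The wrong formula uses $P = \int A\,dt$ with rows $(t, t^2)$ and $(0, 0)$; since $P^2 = tP$, its exponential has rows $(e^t, t(e^t - 1))$ and $(0, 1)$. The function names and the step count are illustrative choices.

```python
import math


def A(t):
    """Time-varying matrix from Problems 24 and 25 of Section 6.4."""
    return [[1.0, 2.0 * t],
            [0.0, 0.0]]


def euler_product(t_end, y0, N):
    """Apply E_0, E_1, ..., E_{N-1} in that order, E_n = I + dt * A(n dt)."""
    dt = t_end / N
    y = list(y0)
    for n in range(N):
        a = A(n * dt)
        y = [y[0] + dt * (a[0][0] * y[0] + a[0][1] * y[1]),
             y[1] + dt * (a[1][0] * y[0] + a[1][1] * y[1])]
    return y


def exact(t, y0):
    """Solution found by solving for y2 first and then y1."""
    return [(y0[0] + 2.0 * y0[1]) * math.exp(t) - 2.0 * (t + 1.0) * y0[1],
            y0[1]]


def wrong_formula(t, y0):
    """exp(integral of A) times y(0), which is not the solution."""
    return [math.exp(t) * y0[0] + t * (math.exp(t) - 1.0) * y0[1],
            y0[1]]


y0 = (0.0, 1.0)
approx = euler_product(1.0, y0, 20000)
print(approx[0])                    # product of Euler steps
print(exact(1.0, y0)[0])            # correct value
print(wrong_formula(1.0, y0)[0])    # noticeably different
```

The Euler product agrees with the correct solution to within the step size, while the exponential of the integral stays visibly off.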


One final good note. The determinant $W(t)$ of the matrix $M(t)$ has a nice formula.
This succeeds because numbers $\det A$ (but not matrices $A$) can be multiplied in any order.
Here is the beautiful fact that gives the equation for the Wronskian determinant $W(t)$:

If $\dfrac{dM}{dt} = AM$ then $\dfrac{dW}{dt} = (\text{trace}(A))\,W$. Therefore $W(t) = e^{\int \text{trace}(A(t))\,dt}\,W(0)$.

This is equation (21) in Section 6.4. We see again that the Wronskian $W(t)$ is never zero,
because exponentials are never zero. For $y'' + B(t)y' + C(t)y = 0$, the companion matrix
has trace $-B(t)$. The Wronskian is $W(t) = e^{-\int B(t)\,dt}\,W(0)$ as Abel discovered.
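
Abel's formula can be checked on the Euler product as well. The Python sketch below assumes a hypothetical equation $y'' + 2t\,y' + y = 0$, so $B(t) = 2t$ and $C(t) = 1$ are illustration choices and the companion matrix has rows $(0, 1)$ and $(-1, -2t)$. Starting from $M(0) = I$ gives $W(0) = 1$, and the formula predicts $W(t) = e^{-t^2}$. The helper names are illustrative.

```python
import math


def A(t):
    """Companion matrix of y'' + B(t) y' + C(t) y = 0 with B = 2t, C = 1."""
    return [[0.0, 1.0],
            [-1.0, -2.0 * t]]


def matmul2(X, Y):
    """Product of two 2 by 2 matrices stored as nested lists."""
    return [[X[i][0] * Y[0][j] + X[i][1] * Y[1][j] for j in range(2)]
            for i in range(2)]


def euler_matrix(t_end, N):
    """Approximate M(t_end) by the ordered product E_{N-1} ... E_1 E_0."""
    dt = t_end / N
    M = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(N):
        a = A(n * dt)
        E = [[1.0 + dt * a[0][0], dt * a[0][1]],
             [dt * a[1][0], 1.0 + dt * a[1][1]]]
        M = matmul2(E, M)          # newer factors multiply on the left
    return M


def det2(M):
    """Determinant of a 2 by 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


t_end = 1.0
W_euler = det2(euler_matrix(t_end, 20000))
W_abel = math.exp(-t_end ** 2)     # exp(-integral of 2t from 0 to t_end)
print(W_euler, W_abel)
```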


Chapter 7 


Applied Mathematics and $A^TA$


A chapter title that includes the symbols $A^TA$ is not usual. Most textbooks deal with $A$
and its eigenvalues, and stop. When the original problem involves a rectangular matrix,
as so many problems do, the steps to reach a square matrix are omitted. In reality,
rectangular matrices are everywhere: they connect current and voltage, displacement
and force, position and momentum, prices and income, pairs of unknowns.


It is true that the eventual equation contains a square matrix (very often symmetric).
We start from $A$ and we reach $A^TA$. Those two matrices have the same nullspace. We want
$A^TA$ to be invertible so we can solve the problem. Then $A$ must have independent columns
(no nullspace except the zero vector) as we now assume: $A$ must be “tall and thin” with
$m > n$ and full column rank $r = n$.


$S = A^TA$ has positive eigenvalues. It is a positive definite symmetric matrix. Its
eigenvectors lead us to the Singular Value Decomposition of A. The SVD in Section 7.2 
is the best way to discover what is important, when a large matrix is filled with data. 
The singular vectors are like eigenvectors for a square matrix, with the extra guarantee of 
orthogonality. 


The chapter starts with m equations in n unknowns—too many equations, too few 
unknowns, and no solution to $Av = b$. This is a major application of linear algebra
(and geometry and calculus). A sensor or a scanner or a counter makes thousands of
measurements. Often we are overwhelmed with data. If it lies close to a straight line,
that line $v_1 + v_2t$ or $C + Dt$ has only $n = 2$ parameters. Those are the two numbers
we want, coming from m = 1000 or 1000000 measurements. 


Our first applications are least squares and weighted least squares. The 2 by 2 matrix
$A^TA$ or $A^TCA$ will appear ($C$ contains the weights). This is the symmetric matrix $S$ of
Section 6.5 and Section 7.1, and the stiffness matrix $K$ of Section 7.4, and the conductance
matrix of Section 7.5, and the second derivative $A^TA = -d^2/dx^2$ in 7.3. (A minus sign is
included, because if $A = d/dx$ is the first derivative then $-d/dx$ is its transpose.)


“Symmetric positive definite”: those are three important words in linear algebra.
And they are key ideas in applied mathematics, to be presented in this chapter. 
7.1 Least Squares and Projections


Start with $Av = b$. The matrix $A$ has $n$ independent columns; its rank is $n$. But $A$ has $m$
rows, and $m$ is greater than $n$. We have $m$ measurements in $b$, and we want to choose
$n < m$ parameters $v$ that fit those measurements. An exact fit $Av = b$ is generally
impossible. We look for the closest fit to the data, the best solution $\hat{v}$.

The error vector $e = b - Av$ tells how close we are to solving $Av = b$. The errors in
the $m$ equations are $e_1, \ldots, e_m$. Make the sum of squares as small as possible.


Least squares solution $\hat{v}$ $\quad$ Minimize $\|e\|^2 = e_1^2 + \cdots + e_m^2 = \|b - A\hat{v}\|^2$.


This is our goal, to reduce $e$. If $Av = b$ has a solution (and possibly it could), then
the best $\hat{v}$ is certainly that solution vector $v$. In this case the error is $e = 0$, certainly
a minimum. But normally there is no exact solution to the $m$ equations $Av = b$. The
column space of $A$ is only an $n$-dimensional subspace of $R^m$. Almost all vectors $b$ are
outside that subspace: they are not combinations of the columns of $A$. We reduce the error
$E = \|e\|^2$ as far as possible, but we cannot reach zero error.


Example 1 Find the straight line $b = C + Dt$ that goes through 4 points: $b = 1, 9, 9, 21$
at $t = 0, 1, 3, 4$. Those are four equations for $C$ and $D$, and they have no solution. The four
crosses in Figure 7.1 are not on a straight line:

$Av = b$ has no solution
$$\begin{aligned} C + 0D &= 1 \\ C + 1D &= 9 \\ C + 3D &= 9 \\ C + 4D &= 21 \end{aligned} \qquad \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 3 \\ 1 & 4 \end{bmatrix}\begin{bmatrix} C \\ D \end{bmatrix} = \begin{bmatrix} 1 \\ 9 \\ 9 \\ 21 \end{bmatrix} \tag{1}$$

$C = 1$ solves the first equation, then $D = 8$ solves the second equation. Then the other
equations fail by a lot. We want a better balance, where no equation is exact but the total
squared error $E = e_1^2 + e_2^2 + e_3^2 + e_4^2$ from all four equations is as small as possible.


The best $C$ and $D$ are 2 and 4. The best $v$ is $\hat{v} = (2, 4)$. The best line is $2 + 4t$.
At the four measurement times $t = 0, 1, 3, 4$, this best line has heights $2, 6, 14, 18$.
In other words, $A\hat{v}$ is $p = (2, 6, 14, 18)$ which is as close as possible to $b = (1, 9, 9, 21)$.


For that vector $p = (2, 6, 14, 18)$, the four bullets in Figure 7.1 fall on the line $2 + 4t$.
How do we find that best solution $\hat{v} = (C, D) = (2, 4)$? It has the smallest error $E$:

$$E = e_1^2 + e_2^2 + e_3^2 + e_4^2 = (1 - C - 0D)^2 + (9 - C - 1D)^2 + (9 - C - 3D)^2 + (21 - C - 4D)^2.$$

We can use pure linear algebra to find $C = 2$ and $D = 4$, or pure calculus. To use calculus,
set two partial derivatives to zero: $\partial E/\partial C = 0$ and $\partial E/\partial D = 0$. Solve for $C$ and $D$.
Linear algebra gives the right triangle in Figure 7.1. The vector $b$ is split into $p + e$.
The heights $p$ lie on a line and the errors $e$ are as small as possible. I will use calculus first,
and then the linear algebra that I prefer, because it produces a right triangle $p + e = b$.
[Figure 7.1, left: the best line $2 + 4t$. Right: $p$ = projection of $b$ onto the columns of $A$.]

Figure 7.1: Two pictures! The best line has $e^Te = 1 + 9 + 25 + 9 = 44 = \|b - p\|^2$.


Let me give away the answer immediately (the equation for $C$ and $D$). Then you can
compute the best solution $\hat{v}$ and the projection $p = A\hat{v}$ and the error $e = b - A\hat{v}$.
The best least squares estimate $\hat{v} = (C, D)$ solves the “normal equations” using the
square symmetric invertible matrix $A^TA$:

Normal equations to find $\hat{v}$
$$A^TA\hat{v} = A^Tb. \tag{2}$$

In short, multiply the unsolvable equations $Av = b$ by $A^T$ to get $A^TA\hat{v} = A^Tb$.


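Example 1 (completed) The normal equations $A^TA\hat{v} = A^Tb$ are

$$\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 3 & 4 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 3 \\ 1 & 4 \end{bmatrix}\begin{bmatrix} C \\ D \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 3 & 4 \end{bmatrix}\begin{bmatrix} 1 \\ 9 \\ 9 \\ 21 \end{bmatrix}. \tag{3}$$

After multiplication this matrix $A^TA$ is square and symmetric and positive definite:

$$A^TA\hat{v} = A^Tb \qquad \begin{bmatrix} 4 & 8 \\ 8 & 26 \end{bmatrix}\begin{bmatrix} C \\ D \end{bmatrix} = \begin{bmatrix} 40 \\ 120 \end{bmatrix} \quad\text{gives}\quad \begin{bmatrix} C \\ D \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \end{bmatrix}. \tag{4}$$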


At $t = 0, 1, 3, 4$ this best line $2 + 4t$ in Figure 7.1 has heights $p = 2, 6, 14, 18$. The minimum
error $b - p$ is $e = (-1, 3, -5, 3)$. The picture on the right is the “linear algebra
way” to see least squares. We project $b$ to $p$ in the column space of $A$ (you see how $p$
is perpendicular to the error vector $e$). Then $A\hat{v} = p$ has the best possible right side $p$.


The solution $\hat{v} = (C, D) = (2, 4)$ is the least squares choice of $C$ and $D$.
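
The numbers in Example 1 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the variable names (`t`, `v_hat`, and so on) are chosen here for illustration and are not from the text.

```python
import numpy as np

# Data from Example 1: measurements b at times t = 0, 1, 3, 4
t = np.array([0.0, 1.0, 3.0, 4.0])
b = np.array([1.0, 9.0, 9.0, 21.0])

# Columns of A: all ones (multiplying C) and the times t (multiplying D)
A = np.column_stack([np.ones_like(t), t])

# Solve the normal equations A^T A v = A^T b
v_hat = np.linalg.solve(A.T @ A, A.T @ b)

p = A @ v_hat   # projection: heights of the best line at the four times
e = b - p       # error vector

print("C, D =", v_hat)
print("p =", p)
print("e =", e, " e.e =", e @ e)
print("A^T e =", A.T @ e)   # should be (numerically) zero
```

Solving the 2 by 2 system directly mirrors the derivation above; for larger or poorly conditioned problems a routine such as `np.linalg.lstsq` is usually the safer choice.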
Normal equations using calculus

The two equations are $\partial E/\partial C = 0$ and $\partial E/\partial D = 0$.


The first column shows the four terms $e_1^2 + e_2^2 + e_3^2 + e_4^2$ that add to $E$. Next to them
are the derivatives that add to $\partial E/\partial C$ and $\partial E/\partial D$. Notice how the chain rule brings factors
0, 1, 3, 4 in the third column for $\partial E/\partial D$.


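$$\begin{array}{lll}
\text{Add to get } E & \text{Add to get } \partial E/\partial C & \text{Add to get } \partial E/\partial D \\
(C + 0D - 1)^2 & 2(C + 0D - 1) & 2(C + 0D - 1)(0) \\
(C + 1D - 9)^2 & 2(C + 1D - 9) & 2(C + 1D - 9)(1) \\
(C + 3D - 9)^2 & 2(C + 3D - 9) & 2(C + 3D - 9)(3) \\
(C + 4D - 21)^2 & 2(C + 4D - 21) & 2(C + 4D - 21)(4)
\end{array}$$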


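No problem to divide all derivatives by 2, when $\partial E/\partial C = 0$ and $\partial E/\partial D = 0$. The last
two columns are added by matrix multiplication (notice the numbers 0, 1, 3, 4 in $\partial E/\partial D$).


$$\frac{1}{2}\begin{bmatrix} \partial E/\partial C \\ \partial E/\partial D \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 3 & 4 \end{bmatrix} \begin{bmatrix} C + 0D - 1 \\ C + 1D - 9 \\ C + 3D - 9 \\ C + 4D - 21 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \qquad (5)$$


The 2 by 4 matrix is $A^T$. The 4 by 1 vector is $A\hat{v} - b$. Calculus has found $A^T A \hat{v} = A^T b$.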
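
As a sanity check on the calculus, the gradient formula $2A^T(Av - b)$ can be compared with a finite-difference approximation of $E$. This is only a sketch under the assumption that NumPy is available; the helper names `E`, `grad_E` and `numeric_grad`, and the trial point `v_test`, are hypothetical.

```python
import numpy as np

A = np.array([[1, 0], [1, 1], [1, 3], [1, 4]], dtype=float)
b = np.array([1, 9, 9, 21], dtype=float)

def E(v):
    """Sum of squared errors ||Av - b||^2."""
    r = A @ v - b
    return r @ r

def grad_E(v):
    """Gradient from the chain rule: 2 A^T (Av - b)."""
    return 2.0 * (A.T @ (A @ v - b))

def numeric_grad(v, h=1e-6):
    """Central-difference approximation to (dE/dC, dE/dD)."""
    g = np.zeros_like(v)
    for i in range(v.size):
        step = np.zeros_like(v)
        step[i] = h
        g[i] = (E(v + step) - E(v - step)) / (2 * h)
    return g

v_test = np.array([1.0, 1.0])          # arbitrary trial point
print(grad_E(v_test), numeric_grad(v_test))
print(grad_E(np.array([2.0, 4.0])))    # gradient at the least squares solution
```

The gradient vanishes at $(C, D) = (2, 4)$, which is exactly the statement $A^T(A\hat{v} - b) = 0$.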


Example 2 Suppose we have two equations for one unknown v. Thus n = 1 but m = 2 
(probably there is no solution). One unknown means only one column in A : 


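$$Av = b \quad \text{is} \quad \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} v = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}. \qquad \text{For example} \quad \begin{array}{l} 2v = 1 \\ 3v = 8 \end{array} \qquad (6)$$
The matrix $A$ is 2 by 1. The squared error is $E = e_1^2 + e_2^2 = (1 - 2v)^2 + (8 - 3v)^2$.
Sum of squares $E(v) = (b_1 - a_1 v)^2 + (b_2 - a_2 v)^2$.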


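The graph of $E(v)$ is a parabola. Its bottom point is at the least squares solution $\hat{v}$. The
minimum error occurs when $dE/dv = 0$:


Equation for $\hat{v}$: $$\frac{dE}{dv} = 2a_1(a_1\hat{v} - b_1) + 2a_2(a_2\hat{v} - b_2) = 0. \qquad (7)$$


Cancel the 2's, so $(a_1^2 + a_2^2)\hat{v} = (a_1 b_1 + a_2 b_2)$. The left side has $a_1^2 + a_2^2 = A^T A$.
The right side is $a_1 b_1 + a_2 b_2 = A^T b$. Calculus has again found $A^T A \hat{v} = A^T b$:


$$\begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} \hat{v} = \begin{bmatrix} a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \quad \text{produces} \quad \hat{v} = \frac{a_1 b_1 + a_2 b_2}{a_1^2 + a_2^2} = \frac{a^T b}{a^T a}. \qquad (8)$$


The numerical example has $a = (2, 3)$ and $b = (1, 8)$ and $\hat{v} = a^T b / a^T a = 26/13 = 2$.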
Example 3 The special case $a_1 = a_2 = 1$ has two measurements $v = b_1$ and $v = b_2$
of the same quantity (like pulse rate or blood pressure). The matrix has $A^T = [\,1 \;\; 1\,]$.
To minimize $(v - b_1)^2 + (v - b_2)^2$, the best $\hat{v}$ is just the average measurement:


If $a_1 = a_2 = 1$ then $A^T A = 2$ and $A^T b = b_1 + b_2$ and $\hat{v} = (b_1 + b_2)/2$.


The linear algebra picture in Figure 7.2 shows the projection of b onto the line through a. 
The projection is p, the angle is 90°, and the other side of the right triangle is e = b — p. 
The normal equations are saying that e is perpendicular to the line through a. 
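
The one-unknown case is short enough to write as a single function. A minimal sketch, assuming NumPy; the function name `fit_one_unknown` and the pulse readings 70 and 80 are made up for illustration.

```python
import numpy as np

def fit_one_unknown(a, b):
    """Least squares solution of a*v = b (one unknown): v_hat = a.b / a.a."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return (a @ b) / (a @ a)

# Example 2: 2v = 1 and 3v = 8
v2 = fit_one_unknown([2, 3], [1, 8])

# Example 3 with hypothetical readings: a = (1, 1) gives the average
v3 = fit_one_unknown([1, 1], [70, 80])

# The error is perpendicular to a
a = np.array([2.0, 3.0])
e = np.array([1.0, 8.0]) - v2 * a

print(v2, v3, a @ e)
```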


Least Squares by Linear Algebra 
Here is the linear algebra approach to $A^T A \hat{v} = A^T b$. It takes one wonderful line:
$e = b - A\hat{v}$ is perpendicular to the column space of $A$. So $e$ is in the nullspace of $A^T$.


Then $A^T b = A^T A \hat{v}$. That fourth subspace $N(A^T)$ is exactly what least squares needs: $e$
is perpendicular to the whole column space of $A$ and not just to $p = A\hat{v} = A(A^T A)^{-1} A^T b$.


Figure 7.2 shows the projection p as an m by m matrix P multiplying b. To project any 
vector onto the column space of A, multiply by the projection matrix P. 


$$\text{Projection matrix gives } p = Pb \qquad P = \frac{a a^T}{a^T a} \quad \text{or} \quad P = A(A^T A)^{-1} A^T. \qquad (9)$$


The first form of $P$ gives the projection on the line through $a$. Here $A$ has only one
column and $A^T A = a^T a$. We can divide by that number, but for $n > 1$ the right notation
is $(A^T A)^{-1}$. The second form gives $P$ in all cases, provided only that $A^T A$ is invertible:


Two key properties of projection matrices $\quad P^T = P$ and $P^2 = P. \qquad (10)$


The projection of p is p itself (because p = Pb is already in the column space). Then 
two projections give the same result as one projection: P(Pb) = Pb and P? = P. 


Figure 7.2: The projection $p = A\hat{v} = A(A^T A)^{-1} A^T b$ is the nearest point to $b$ in the column space of $A$.
Left ($n = 1$): column space = line through $a$. Right ($n = 2$): column space = plane.
Let me review the four essential equations of (unweighted) least squares:


$Av = b$ : $m$ equations, $n$ unknowns, probably no solution


$A^T A \hat{v} = A^T b$ : normal equations, $\hat{v} = (A^T A)^{-1} A^T b$ = best $v$


$p = A\hat{v} = A(A^T A)^{-1} A^T b$ : projection $p$ of $b$ onto the column space of $A$


$P = A(A^T A)^{-1} A^T$ : projection matrix $P$ produces $p = Pb$ for any $b$
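
Example 4 If $A = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{bmatrix}$ and $b = \begin{bmatrix} 6 \\ 0 \\ 0 \end{bmatrix}$ find $\hat{v}$ and $p$ and the matrix $P$.


Solution Compute the square matrix $A^T A$ and also the vector $A^T b$:


$$A^T A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 3 & 3 \\ 3 & 5 \end{bmatrix} \quad \text{and} \quad A^T b = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \end{bmatrix} \begin{bmatrix} 6 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 6 \\ 0 \end{bmatrix}.$$

Now solve the normal equations $A^T A \hat{v} = A^T b$ to find $\hat{v}$:


$$\begin{bmatrix} 3 & 3 \\ 3 & 5 \end{bmatrix} \begin{bmatrix} C \\ D \end{bmatrix} = \begin{bmatrix} 6 \\ 0 \end{bmatrix} \quad \text{gives} \quad \hat{v} = \begin{bmatrix} C \\ D \end{bmatrix} = \begin{bmatrix} 5 \\ -3 \end{bmatrix}. \qquad (11)$$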


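The combination $p = A\hat{v}$ is the projection of $b$ onto the column space of $A$:


$$p = 5 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} - 3 \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 5 \\ 2 \\ -1 \end{bmatrix}. \quad \text{The error is } e = b - p = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}. \qquad (12)$$

Two checks on the calculation. First, the error $e = (1, -2, 1)$ is perpendicular to both
columns $(1, 1, 1)$ and $(0, 1, 2)$. Second, the projection matrix $P$ times $b = (6, 0, 0)$ correctly
gives $p = (5, 2, -1)$. That solves the problem for one particular $b$.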

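To find $p = Pb$ for every $b$, compute $P = A(A^T A)^{-1} A^T$. The determinant of $A^T A$ is
$15 - 9 = 6$; then $(A^T A)^{-1}$ is easy. Multiply $A$ times $(A^T A)^{-1}$ times $A^T$ to reach $P$:


$$(A^T A)^{-1} = \frac{1}{6} \begin{bmatrix} 5 & -3 \\ -3 & 3 \end{bmatrix} \quad \text{and} \quad P = \frac{1}{6} \begin{bmatrix} 5 & 2 & -1 \\ 2 & 2 & 2 \\ -1 & 2 & 5 \end{bmatrix}. \qquad (13)$$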


We must have P? = P, because a second projection doesn’t change the first projection. 
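
The projection matrix of Example 4 and its two key properties can be verified in a few lines. This is a sketch assuming NumPy; forming the explicit inverse is done only to follow the formula $P = A(A^T A)^{-1} A^T$ literally.

```python
import numpy as np

A = np.array([[1, 0], [1, 1], [1, 2]], dtype=float)
b = np.array([6, 0, 0], dtype=float)

# Projection matrix onto the column space of A
P = A @ np.linalg.inv(A.T @ A) @ A.T

p = P @ b        # projection of b
e = b - p        # error, perpendicular to the columns of A

print(6 * P)     # compare with the integer matrix in equation (13)
print(p, e)
print(np.allclose(P @ P, P), np.allclose(P.T, P))
```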


Warning The matrix $P = A(A^T A)^{-1} A^T$ is deceptive. You might try to split $(A^T A)^{-1}$
into $A^{-1}$ times $(A^T)^{-1}$. If you make that mistake, and substitute it into $P$, you will find
$P = A A^{-1} (A^T)^{-1} A^T$. Apparently everything cancels. This looks like $P = I$, the identity
matrix. The next two lines explain why this is wrong.

The matrix $A$ is rectangular. It has no inverse matrix. We cannot split $(A^T A)^{-1}$ into
$A^{-1}$ times $(A^T)^{-1}$ because there is no $A^{-1}$ in the first place.

In our experience, a problem that involves a rectangular matrix almost always leads to
$A^T A$. When $A$ has independent columns, $A^T A$ is invertible. This fact is so crucial that we
state it clearly and give a proof.


$A^T A$ is invertible if and only if $A$ has linearly independent columns.


Proof $A^T A$ is a square matrix ($n$ by $n$). For every matrix $A$, we will now show that
$A^T A$ has the same nullspace as $A$. When $A$ has independent columns, its nullspace contains
only the zero vector. Then $A^T A$, with this same nullspace, is invertible.

Let $A$ be any matrix. If $x$ is in its nullspace, then $Ax = 0$. Multiplying by $A^T$ gives
$A^T A x = 0$. So $x$ is also in the nullspace of $A^T A$.

Now start with the nullspace of $A^T A$. From $A^T A x = 0$ we must prove $Ax = 0$.
We can't multiply by $(A^T)^{-1}$, which generally doesn't exist. Just multiply by $x^T$:


$$(x^T) A^T A x = 0 \quad \text{or} \quad (Ax)^T (Ax) = 0 \quad \text{or} \quad \|Ax\|^2 = 0.$$


This says: If $A^T A x = 0$ then $Ax$ has length zero. Therefore $Ax = 0$.
Every vector in one nullspace is in the other nullspace. If $A^T A$ has dependent columns,
so has $A$. If $A^T A$ has independent columns, so has $A$. This is the good case:


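When $A$ has independent columns, $A^T A$ is square, symmetric, and invertible.


To repeat for emphasis: $A^T A$ is ($n$ by $m$) times ($m$ by $n$). Then $A^T A$ is square ($n$ by $n$).
It is symmetric, because its transpose is $(A^T A)^T = A^T (A^T)^T$ which equals $A^T A$. We just
proved that $A^T A$ is invertible, provided $A$ has independent columns. Watch the difference
between dependent columns and independent columns:


$$\underbrace{\begin{bmatrix} 1 & 1 & 0 \\ 2 & 2 & 0 \end{bmatrix}}_{A^T} \underbrace{\begin{bmatrix} 1 & 2 \\ 1 & 2 \\ 0 & 0 \end{bmatrix}}_{A} = \underbrace{\begin{bmatrix} 2 & 4 \\ 4 & 8 \end{bmatrix}}_{A^T A} \qquad\qquad \underbrace{\begin{bmatrix} 1 & 1 & 0 \\ 2 & 2 & 1 \end{bmatrix}}_{A^T} \underbrace{\begin{bmatrix} 1 & 2 \\ 1 & 2 \\ 0 & 1 \end{bmatrix}}_{A} = \underbrace{\begin{bmatrix} 2 & 4 \\ 4 & 9 \end{bmatrix}}_{A^T A}$$

Left: dependent columns, $A^T A$ singular. Right: independent columns, $A^T A$ invertible.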
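
The link between the columns of $A$ and the invertibility of $A^T A$ can be illustrated numerically. A minimal sketch, assuming NumPy; the two 3 by 2 matrices `A_dep` and `A_ind` are small examples chosen here (one with parallel columns, one without).

```python
import numpy as np

A_dep = np.array([[1, 2], [1, 2], [0, 0]], dtype=float)  # column 2 = 2 * column 1
A_ind = np.array([[1, 2], [1, 2], [0, 1]], dtype=float)  # independent columns

for name, A in [("dependent", A_dep), ("independent", A_ind)]:
    G = A.T @ A                         # the square symmetric matrix A^T A
    rank = np.linalg.matrix_rank(A)
    det = np.linalg.det(G)
    print(name, "rank(A) =", rank, "det(A^T A) =", det)

# A vector in the nullspace of A_dep is also in the nullspace of A_dep^T A_dep
x = np.array([2.0, -1.0])
print(A_dep @ x, (A_dep.T @ A_dep) @ x)
```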


Very brief summary To find the projection $p = \hat{v}_1 a_1 + \cdots + \hat{v}_n a_n$, solve $A^T A \hat{v} = A^T b$.
This gives $\hat{v}$. The projection is $A\hat{v}$ and the error is $e = b - p = b - A\hat{v}$. The projection
matrix $P = A(A^T A)^{-1} A^T$ multiplies $b$ to give the projection $p = Pb$.

This matrix satisfies $P^2 = P$. The distance from $b$ to the subspace is $\|e\|$.




Weighted Least Squares 


There is normally error in the measurements $b$. That produces error in the output $\hat{v}$.
Some measurements $b_i$ may be more reliable than others (from more accurate sensors).
We should give heavier weight to those reliable $b_i$.

We assume that the expected error in each $b_i$ is zero. Then negative errors balance
positive errors in the long run, and the mean error is zero. The expected squared error
in the measurement $b_i$ (the “mean squared error”) is its variance $\sigma_i^2$:


Mean $m_i = E[e_i] = 0 \qquad$ Variance $\sigma_i^2$ = expected squared error $E[e_i^2] \qquad (14)$

We should give equation $i$ more weight when $\sigma_i$ is small. Then $b_i$ is more reliable.

Statistically, the right weight is $w_i = 1/\sigma_i$. We multiply $Av = b$ by the diagonal matrix
$W$ with those weights $w_1, \ldots, w_m$. Then solve $WAv = Wb$ by ordinary least squares,
using $WA$ and $Wb$ instead of $A$ and $b$:


Weighted least squares $\quad (WA)^T (WA)\hat{v} = (WA)^T W b \quad$ is $\quad A^T C A \hat{v} = A^T C b. \qquad (15)$


$C = W^T W$ goes between $A^T$ and $A$, to produce the weighted matrix $K = A^T C A$.


Example 5 Your pulse rate $v$ is measured twice. Using unweighted least squares
($w_1 = w_2 = 1$), the best estimate is $\hat{v} = \frac{1}{2}(b_1 + b_2)$. Example 3 finds that least squares
solution $\hat{v}$ to two equations $v = b_1$ and $v = b_2$. But if you were more nervous the first
time, then $\sigma_1$ is larger than $\sigma_2$. The first measurement $b_1$ has a larger variance than $b_2$.

We should weight the two measurements by $w_1 = 1/\sigma_1$ and $w_2 = 1/\sigma_2$:

$$\text{With weights} \quad \begin{array}{l} w_1 v = w_1 b_1 \\ w_2 v = w_2 b_2 \end{array} \qquad \hat{v} = \frac{w_1^2 b_1 + w_2^2 b_2}{w_1^2 + w_2^2}. \qquad (16)$$


When $w_1 = w_2 = 1$, that answer $\hat{v}$ reduces to the unweighted estimate $\frac{1}{2}(b_1 + b_2)$.
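
The weighted normal equations $A^T C A \hat{v} = A^T C b$ with $C = W^T W$ translate directly into code. The sketch below assumes NumPy; the function name `weighted_least_squares`, the two pulse readings and the standard deviations are invented for illustration.

```python
import numpy as np

def weighted_least_squares(A, b, sigma):
    """Solve A^T C A v = A^T C b with C = W^T W and W = diag(1/sigma_i)."""
    W = np.diag(1.0 / np.asarray(sigma, dtype=float))
    C = W.T @ W
    K = A.T @ C @ A                      # weighted matrix K = A^T C A
    return np.linalg.solve(K, A.T @ C @ b)

# Two measurements of one quantity: A is 2 by 1
A = np.array([[1.0], [1.0]])
b = np.array([80.0, 70.0])               # hypothetical readings
sigma = np.array([2.0, 1.0])             # first reading is less reliable

v_w = weighted_least_squares(A, b, sigma)[0]

# Closed form for this case: (w1^2 b1 + w2^2 b2) / (w1^2 + w2^2)
w = 1.0 / sigma
v_formula = (w[0]**2 * b[0] + w[1]**2 * b[1]) / (w[0]**2 + w[1]**2)

# Equal weights fall back to the plain average
v_equal = weighted_least_squares(A, b, [1.0, 1.0])[0]

print(v_w, v_formula, v_equal)
```

The weighted estimate is pulled toward the second (more reliable) reading, while equal weights return the average.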

The weighted $K = A^T C A$ has the same good properties as the unweighted $A^T A$:
square, symmetric, and invertible when $A$ has independent columns (as in the example).
Then all eigenvalues of $A^T A$ and $A^T C A$ have $\lambda > 0$: positive definite matrices!


REVIEW OF THE KEY IDEAS


1. The least squares solution $\hat{v}$ minimizes $E = \|b - Av\|^2$. Then $A^T A \hat{v} = A^T b$.

2. To fit $m$ points by a line $C + Dt$, $A$ is $m$ by 2 and $\hat{v} = (C, D)$ gives the best line.

3. The projection of $b$ on the column space of $A$ is $p = A\hat{v} = Pb$: closest point to $b$.

4. The error is $e = b - p$. The projection matrix is $P = A(A^T A)^{-1} A^T$ with $P^2 = P$.

5. Weighted least squares has $A^T C A \hat{v} = A^T C b$. Good weights $c_i$ are 1/variance of $b_i$.
Problem Set 7.1


1 Suppose your pulse is measured at $b_1 = 70$ beats per minute, then $b_2 = 120$, then
$b_3 = 80$. The least squares solution to three equations $v = b_1$, $v = b_2$, $v = b_3$ with
$A^T = [\,1 \;\; 1 \;\; 1\,]$ is $\hat{v} = (A^T A)^{-1} A^T b =$ ____. Use calculus and projections:


(a) Minimize $E = (v - 70)^2 + (v - 120)^2 + (v - 80)^2$ by solving $dE/dv = 0$.
(b) Project $b = (70, 120, 80)$ onto $a = (1, 1, 1)$ to find $\hat{v} = a^T b / a^T a$.


2 Suppose $Av = b$ has $m$ equations $a_i v = b_i$ in one unknown $v$. For the sum of squares
$E = (a_1 v - b_1)^2 + \cdots + (a_m v - b_m)^2$, find the minimizing $\hat{v}$ by calculus. Then form
$A^T A \hat{v} = A^T b$ with one column in $A$, and reach the same $\hat{v}$.


3 With $b = (4, 1, 0, 1)$ at the points $x = (0, 1, 2, 3)$ set up and solve the normal equation
for the coefficients $\hat{v} = (C, D)$ in the nearest line $C + Dx$. Start with the four equations
$Av = b$ that would be solvable if the points fell on a line.


4 In Problem 3, find the projection $p = A\hat{v}$. Check that those four values lie on the line
$C + Dx$. Compute the error $e = b - p$ and verify that $A^T e = 0$.


5 (Problem 3 by calculus) Write down $E = \|b - Av\|^2$ as a sum of four squares: the
last one is $(1 - C - 3D)^2$. Find the derivative equations $\partial E/\partial C = \partial E/\partial D = 0$.
Divide by 2 to obtain $A^T A \hat{v} = A^T b$.


6 For the closest parabola $C + Dt + Et^2$ to the same four points, write down 4 unsolvable
equations $Av = b$ for $v = (C, D, E)$. Set up the normal equations for $\hat{v}$. If you fit the
best cubic $C + Dt + Et^2 + Ft^3$ to those four points (thought experiment), what is the
error vector $e$?


7 Write down three equations for the line $b = C + Dt$ to go through $b = 7$ at
$t = -1$, $b = 7$ at $t = 1$, and $b = 21$ at $t = 2$. Find the least squares solution
$\hat{v} = (C, D)$ and draw the closest line.


8 Find the projection $p = A\hat{v}$ in Problem 7. This gives the three heights of the closest
line. Show that the error vector is $e = (2, -6, 4)$.


9 Suppose the measurements at $t = -1, 1, 2$ are the errors $2, -6, 4$ in Problem 8.
Compute $\hat{v}$ and the closest line to these new measurements. Explain the answer:
$b = (2, -6, 4)$ is perpendicular to ____ so the projection is $p = 0$.

10 Suppose the measurements at $t = -1, 1, 2$ are $b = (5, 13, 17)$. Compute $\hat{v}$ and the
closest line and $e$. The error is $e = 0$ because this $b$ is ____.

11 Find the best line $C + Dt$ to fit $b = 4, 2, -1, 0, 0$ at times $t = -2, -1, 0, 1, 2$.


12 Find the plane that gives the best fit to the 4 values $b = (0, 1, 3, 4)$ at the corners
$(1, 0)$ and $(0, 1)$ and $(-1, 0)$ and $(0, -1)$ of a square. At those 4 points, the equations
$C + Dx + Ey = b$ are $Av = b$ with 3 unknowns $v = (C, D, E)$.
13 With $b = 0, 8, 8, 20$ at $t = 0, 1, 3, 4$ set up and solve the normal equations $A^T A \hat{v} = A^T b$.
For the best straight line $C + Dt$, find its four heights $p_i$ and four errors $e_i$.
What is the minimum value $E = e_1^2 + e_2^2 + e_3^2 + e_4^2$?


14 (By calculus) Write down $E = \|b - Av\|^2$ as a sum of four squares; the last one
is $(C + 4D - 20)^2$. Find the derivative equations $\partial E/\partial C = 0$ and $\partial E/\partial D = 0$.
Divide by 2 to obtain the normal equations $A^T A \hat{v} = A^T b$.


15 Which of the four subspaces contains the error vector $e$? Which contains $p$? Which
contains $\hat{v}$?


16 Find the height $C$ of the best horizontal line to fit $b = (0, 8, 8, 20)$. An exact fit
would solve the four unsolvable equations $C = 0$, $C = 8$, $C = 8$, $C = 20$. Find
the 4 by 1 matrix $A$ in these equations and solve $A^T A \hat{v} = A^T b$.


17 Write down three equations for the line $b = C + Dt$ to go through $b = 7$ at
$t = -1$, $b = 7$ at $t = 1$, and $b = 21$ at $t = 2$. Find the least squares solution
$\hat{v} = (C, D)$ and draw the closest line.


18 Find the projection $p = A\hat{v}$ in Problem 17. This gives the three heights of the closest
line. Show that the error vector is $e = (2, -6, 4)$. Why is $Pe = 0$?


19 Suppose the measurements at $t = -1, 1, 2$ are the errors $2, -6, 4$ in Problem 18.
Compute $\hat{v}$ and the closest line to these new measurements. Explain the answer:
$b = (2, -6, 4)$ is perpendicular to ____ so the projection is $p = 0$.

20 Suppose the measurements at $t = -1, 1, 2$ are $b = (5, 13, 17)$. Compute $\hat{v}$ and the
closest line and $e$. The error is $e = 0$ because this $b$ is ____?


Questions 21-26 ask for projections onto lines. Also errors e = b — p and matrices P. 


21 Project the vector $b$ onto the line through $a$. Check that $e$ is perpendicular to $a$:


1 1 —1 
(a) b= | 2 and a=] 1 (b) b=] 3 and a=| -3 
3 1 —1 


22 Draw the projection of $b$ onto $a$ and also compute it from $p = \hat{v} a$:


@ b= | OF | anda =| 6 | (b) b= |} |aa=| 1 |. 


23 In Problem 22 find the projection matrix $P = a a^T / a^T a$ onto each vector $a$. Verify
in both cases that $P^2 = P$. Multiply $Pb$ in each case to find the projection $p$.


24 Construct the projection matrices $P_1$ and $P_2$ onto the lines through the $a$'s in
Problem 22. Is it true that $(P_1 + P_2)^2 = P_1 + P_2$? This would be true if $P_1 P_2 = 0$.


25 Compute the projection matrices $a a^T / a^T a$ onto the lines through $a_1 = (-1, 2, 2)$
and $a_2 = (2, 2, -1)$. Multiply those two matrices $P_1 P_2$ and explain the answer.

26 Continuing Problem 25, find the projection matrix $P_3$ onto $a_3 = (2, -1, 2)$. Verify
that $P_1 + P_2 + P_3 = I$. The basis $a_1, a_2, a_3$ is orthogonal!


27 Project the vector $b = (1, 1)$ onto the lines through $a_1 = (1, 0)$ and $a_2 = (1, 2)$.
Draw the projections $p_1$ and $p_2$ and add $p_1 + p_2$. The projections do not add to $b$
because the $a$'s are not orthogonal.


28 (Quick and recommended) Suppose $A$ is the 4 by 4 identity matrix with its last column
removed. $A$ is 4 by 3. Project $b = (1, 2, 3, 4)$ onto the column space of $A$. What shape
is the projection matrix $P$ and what is $P$?


29 If $A$ is doubled, then $P = 2A(4A^T A)^{-1} 2A^T$. This is the same as $A(A^T A)^{-1} A^T$.
The column space of $2A$ is the same as ____. Is $\hat{v}$ the same for $A$ and $2A$?


30 What linear combination of $(1, 2, -1)$ and $(1, 0, 1)$ is closest to $b = (2, 1, 1)$?


31 (Important) If $P^2 = P$ show that $(I - P)^2 = I - P$. When $P$ projects onto the column
space of $A$, $I - P$ projects onto which fundamental subspace?


32 If $P$ is the 3 by 3 projection matrix onto the line through $(1, 1, 1)$, then $I - P$ is the
projection matrix onto ____.


33 Multiply the matrix $P = A(A^T A)^{-1} A^T$ by itself. Cancel to prove that $P^2 = P$.
Explain why $P(Pb)$ always equals $Pb$: The vector $Pb$ is in the column space so its
projection is ____.


34 If $A$ is square and invertible, the warning against splitting $(A^T A)^{-1}$ does not apply.
Then $A A^{-1} (A^T)^{-1} A^T = I$ is true. When $A$ is invertible, why is $P = I$ and $e = 0$?


35 An important fact about $A^T A$ is this: If $A^T A x = 0$ then $Ax = 0$. New proof:
The vector $Ax$ is in the nullspace of ____. $Ax$ is always in the column space of
____. To be in both of those perpendicular spaces, $Ax$ must be zero.


Notes on mean and variance and test grades

If all grades on a test are 90, the mean is $m = 90$ and the variance is $\sigma^2 = 0$. Suppose
the expected grades are $g_1, \ldots, g_N$. Then $\sigma^2$ comes from squaring distances to the mean:

$$\text{Mean} \quad m = \frac{g_1 + \cdots + g_N}{N} \qquad \text{Variance} \quad \sigma^2 = \frac{(g_1 - m)^2 + \cdots + (g_N - m)^2}{N}$$

After every test my class wants to know $m$ and $\sigma$. My expectations are usually way off.


36 Show that $\sigma^2$ also equals $\frac{1}{N}(g_1^2 + \cdots + g_N^2) - m^2$.


37 If you flip a fair coin $N$ times (1 for heads, 0 for tails) what is the expected number
$m$ of heads? What is the variance $\sigma^2$?
7.2 Positive Definite Matrices and the SVD


This chapter about applications of $A^T A$ depends on two important ideas in linear algebra.
These ideas have big parts to play, and we focus on them now.


1. Positive definite symmetric matrices (both $A^T A$ and $A^T C A$ are positive definite)
2. Singular Value Decomposition ($A = U \Sigma V^T$ gives perfect bases for the 4 subspaces)


Those are orthogonal matrices $U$ and $V$ in the SVD. Their columns are orthonormal
eigenvectors of $A A^T$ and $A^T A$. The entries in the diagonal matrix $\Sigma$ are the square
roots of the eigenvalues. The matrices $A A^T$ and $A^T A$ have the same nonzero eigenvalues.

Section 6.5 showed that the eigenvectors of these symmetric matrices are orthogonal.
I will show now that the eigenvalues of $A^T A$ are positive, if $A$ has independent columns.


Start with $A^T A x = \lambda x$. Then $x^T A^T A x = \lambda x^T x$. Therefore $\lambda = \|Ax\|^2 / \|x\|^2 > 0$.


I separated $x^T A^T A x$ into $(Ax)^T (Ax) = \|Ax\|^2$. We don't have $\lambda = 0$ because $A^T A$ is
invertible (since $A$ has independent columns). The eigenvalues must be positive.

Those are the key steps to understanding positive definite matrices. They give us three
tests on $S$, three ways to recognize when a symmetric matrix $S$ is positive definite:


Positive definite symmetric
1. All the eigenvalues of $S$ are positive.
2. The “energy” $x^T S x$ is positive for all nonzero vectors $x$.
3. $S$ has the form $S = A^T A$ with independent columns in $A$.


There is also a test on the pivots (all $> 0$) and a test on $n$ determinants (all $> 0$).


Example 1 Are these matrices positive definite? When their eigenvalues are positive,
construct matrices $A$ with $S = A^T A$ and find the positive energy $x^T S x$.


$$\text{(a)} \;\; S = \begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix} \qquad \text{(b)} \;\; S = \begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix} \qquad \text{(c)} \;\; S = \begin{bmatrix} 4 & 5 \\ 5 & 4 \end{bmatrix}$$


Solution The answers are yes, yes, and no. The eigenvalues of those matrices $S$ are
(a) 4 and 1: positive (b) 9 and 1: positive (c) 9 and $-1$: not positive.


A quicker test than eigenvalues uses two determinants: the 1 by 1 determinant $S_{11}$ and
the 2 by 2 determinant of $S$. Example (b) has $S_{11} = 5$ and $\det S = 25 - 16 = 9$ (pass).
Example (c) has $S_{11} = 4$ but $\det S = 16 - 25 = -9$ (fail the test).
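Positive energy is equivalent to positive eigenvalues, when $S$ is symmetric. Let me test
the energy $x^T S x$ in all three examples. Two examples pass and the third fails:


$$\begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 4x_1^2 + x_2^2 > 0 \qquad \text{Positive energy when } x \neq 0$$

$$\begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 5x_1^2 + 8x_1 x_2 + 5x_2^2 \qquad \text{Positive energy when } x \neq 0$$

$$\begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 4 & 5 \\ 5 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 4x_1^2 + 10x_1 x_2 + 4x_2^2 \qquad \text{Energy } -2 \text{ when } x = (1, -1)$$


Positive energy is a fundamental property. This is the best definition of positive definiteness.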
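
The three tests (eigenvalues, upper left determinants, energy) can be run side by side on the matrices of Example 1. This is a sketch assuming NumPy; the helper names `leading_minors` and `energy` are hypothetical.

```python
import numpy as np

def leading_minors(S):
    """Upper left determinants of sizes 1, 2, ..., n."""
    return [np.linalg.det(S[:k, :k]) for k in range(1, S.shape[0] + 1)]

def energy(S, x):
    """The quadratic form x^T S x."""
    x = np.asarray(x, dtype=float)
    return x @ S @ x

Sa = np.array([[4, 0], [0, 1]], dtype=float)
Sb = np.array([[5, 4], [4, 5]], dtype=float)
Sc = np.array([[4, 5], [5, 4]], dtype=float)

for name, S in [("(a)", Sa), ("(b)", Sb), ("(c)", Sc)]:
    eigs = np.linalg.eigvalsh(S)        # eigenvalues of a symmetric matrix, ascending
    print(name, "eigenvalues", eigs,
          "minors", leading_minors(S),
          "positive definite:", bool(np.all(eigs > 0)))

print("energy of (c) at x = (1, -1):", energy(Sc, [1, -1]))
```

A single vector with negative energy, such as $x = (1, -1)$ for matrix (c), is enough to rule out positive definiteness.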


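When the eigenvalues are positive, there will be many matrices $A$ that give $A^T A = S$.
One choice of $A$ is symmetric and positive definite! Then $A^T A$ is $A^2$, and this choice
$A = \sqrt{S}$ is a true square root of $S$. The successful examples (a) and (b) have $S = A^2$:

$$\begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \qquad \text{and} \qquad \begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$$

We know that all symmetric matrices have the form $S = V \Lambda V^T$ with orthonormal
eigenvectors in $V$. The diagonal matrix $\Lambda$ has a square root $\sqrt{\Lambda}$, when all eigenvalues are
positive. In this case $A = \sqrt{S} = V \sqrt{\Lambda} V^T$ is the symmetric positive definite square root:


$$A^T A = \sqrt{S}\sqrt{S} = (V \sqrt{\Lambda} V^T)(V \sqrt{\Lambda} V^T) = V \sqrt{\Lambda} \sqrt{\Lambda} V^T = S \quad \text{because } V^T V = I.$$

Starting from this unique square root $\sqrt{S}$, other choices of $A$ come easily. Multiply $\sqrt{S}$
by any matrix $Q$ that has orthonormal columns (so that $Q^T Q = I$). Then $Q\sqrt{S}$ is another
choice for $A$ (not a symmetric choice). In fact all choices come this way:


$$A^T A = (Q\sqrt{S})^T (Q\sqrt{S}) = \sqrt{S}\, Q^T Q \sqrt{S} = S. \qquad (1)$$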
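
The construction $\sqrt{S} = V\sqrt{\Lambda}V^T$, followed by multiplication with an orthogonal $Q$, can be sketched as follows. NumPy is assumed; the rotation angle `theta` is an arbitrary value chosen for illustration.

```python
import numpy as np

S = np.array([[5, 4], [4, 5]], dtype=float)

# Symmetric eigendecomposition S = V diag(lam) V^T
lam, V = np.linalg.eigh(S)

# Symmetric positive definite square root: sqrt(S) = V sqrt(Lambda) V^T
sqrt_S = V @ np.diag(np.sqrt(lam)) @ V.T

# Any orthogonal Q gives another A = Q sqrt(S) with A^T A = S
theta = 0.3                                   # arbitrary rotation angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
A = Q @ sqrt_S

print(sqrt_S)
print(np.allclose(sqrt_S @ sqrt_S, S), np.allclose(A.T @ A, S))
print("A symmetric?", np.allclose(A, A.T))
```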


I will choose a particular Q in Example 1, to get particular choices of A. 


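Example 1 (continued) Choose $Q = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ to multiply $\sqrt{S}$. Then $A = Q\sqrt{S}$.


$$A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 2 & 0 \end{bmatrix} \quad \text{has} \quad S = A^T A = \begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix}$$

$$A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \quad \text{has} \quad S = A^T A = \begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix}.$$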
Positive Semidefinite Matrices


Positive semidefinite matrices include positive definite matrices, and more. Eigenvalues of
$S$ can be zero. Columns of $A$ can be dependent. The energy $x^T S x$ can be zero, but not
negative. This gives new equivalent conditions on a (possibly singular) matrix $S = S^T$.


1′ All eigenvalues of $S$ satisfy $\lambda \geq 0$ (semidefinite allows zero eigenvalues).
2′ The energy is nonnegative for every $x$: $x^T S x \geq 0$ (zero energy is allowed).
3′ $S$ has the form $A^T A$ (every $A$ is allowed; its columns can be dependent).


Example 2 The first two matrices are singular and positive semidefinite, but not the third:


$$\text{(d)} \;\; S = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \qquad \text{(e)} \;\; S = \begin{bmatrix} 4 & 4 \\ 4 & 4 \end{bmatrix} \qquad \text{(f)} \;\; S = \begin{bmatrix} -4 & 4 \\ 4 & -4 \end{bmatrix}$$


The eigenvalues are 1, 0 and 8, 0 and $-8$, 0. The energies $x^T S x$ are $x_2^2$ and $4(x_1 + x_2)^2$ and
$-4(x_1 - x_2)^2$. So the third matrix is actually negative semidefinite.


Singular Value Decomposition 


Now we start with $A$, square or rectangular. Applications also start this way: the matrix
comes from the model. The SVD splits any matrix into orthogonal $U$ times diagonal $\Sigma$ times
orthogonal $V^T$. Those orthogonal factors will give orthogonal bases for the four
fundamental subspaces associated with $A$.

Let me describe the goal for any $m$ by $n$ matrix, and then how to achieve that goal.


Find orthonormal bases $v_1, \ldots, v_n$ for $R^n$ and $u_1, \ldots, u_m$ for $R^m$ so that


$$Av_1 = \sigma_1 u_1 \quad \ldots \quad Av_r = \sigma_r u_r \qquad Av_{r+1} = 0 \quad \ldots \quad Av_n = 0 \qquad (2)$$


The rank of $A$ is $r$. Those requirements in (2) are expressed by a multiplication $AV = U\Sigma$.
The $r$ nonzero singular values $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0$ are on the diagonal of $\Sigma$:


$$AV = U\Sigma: \qquad A \begin{bmatrix} v_1 & \cdots & v_r & \cdots & v_n \end{bmatrix} = \begin{bmatrix} u_1 & \cdots & u_r & \cdots & u_m \end{bmatrix} \begin{bmatrix} \sigma_1 & & & 0 \\ & \ddots & & \\ & & \sigma_r & \\ 0 & & & 0 \end{bmatrix} \qquad (3)$$


The last $n - r$ vectors in $V$ are a basis for the nullspace of $A$. The last $m - r$ vectors in $U$
are a basis for the nullspace of $A^T$. The diagonal matrix $\Sigma$ is $m$ by $n$, with $r$ nonzeros.
Remember that $V^{-1} = V^T$, because the columns $v_1, \ldots, v_n$ are orthonormal in $R^n$:


Singular Value Decomposition $\quad AV = U\Sigma$ becomes $A = U \Sigma V^T. \qquad (4)$
The SVD has orthogonal matrices $U$ and $V$, containing eigenvectors of $A A^T$ and $A^T A$.
Comment. A square matrix is diagonalized by its eigenvectors: $Ax_i = \lambda_i x_i$ is like
$Av_i = \sigma_i u_i$. But even if $A$ has $n$ eigenvectors, they may not be orthogonal. We need two
bases: an input basis of $v$'s in $R^n$ and an output basis of $u$'s in $R^m$. With two bases, any
$m$ by $n$ matrix can be diagonalized. The beauty of those bases is that they can be chosen
orthonormal. Then $U^T U = I$ and $V^T V = I$.

The $v$'s are eigenvectors of the symmetric matrix $S = A^T A$. We can guarantee their
orthogonality, so that $v_j^T v_i = 0$ for $j \neq i$. That matrix $S$ is positive semidefinite, so its
eigenvalues are $\sigma_i^2 \geq 0$. The key to the SVD is that $Av_j$ is orthogonal to $Av_i$:


$$\text{Orthogonal } u\text{'s} \qquad (Av_j)^T (Av_i) = v_j^T (A^T A v_i) = v_j^T (\sigma_i^2 v_i) = \begin{cases} \sigma_i^2 & \text{if } j = i \\ 0 & \text{if } j \neq i \end{cases} \qquad (5)$$

This says that the vectors $u_i = Av_i / \sigma_i$ are orthonormal for $i = 1, \ldots, r$. They are a basis
for the column space of $A$. And the $u$'s are eigenvectors of the symmetric matrix $A A^T$,
which is usually different from $S = A^T A$ (but the eigenvalues $\sigma_1^2, \ldots, \sigma_r^2$ are the same).


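Example 3 Find the input and output eigenvectors $v$ and $u$ for the rectangular matrix $A$:


$$A = \begin{bmatrix} 2 & 2 & 0 \\ -1 & 1 & 0 \end{bmatrix} = U \Sigma V^T.$$


Solution Compute $S = A^T A$ and its unit eigenvectors $v_1, v_2, v_3$. The eigenvalues $\sigma^2$
are 8, 2, 0 so the positive singular values are $\sigma_1 = \sqrt{8}$ and $\sigma_2 = \sqrt{2}$:


$$A^T A = \begin{bmatrix} 5 & 3 & 0 \\ 3 & 5 & 0 \\ 0 & 0 & 0 \end{bmatrix} \quad \text{has} \quad v_1 = \frac{1}{2} \begin{bmatrix} \sqrt{2} \\ \sqrt{2} \\ 0 \end{bmatrix}, \quad v_2 = \frac{1}{2} \begin{bmatrix} \sqrt{2} \\ -\sqrt{2} \\ 0 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.$$


The outputs $u_1 = Av_1 / \sigma_1$ and $u_2 = Av_2 / \sigma_2$ are also orthonormal, with $\sigma_1 = \sqrt{8}$ and
$\sigma_2 = \sqrt{2}$. Those vectors $u_1$ and $u_2$ are in the column space of $A$:


$$u_1 = \begin{bmatrix} 2 & 2 & 0 \\ -1 & 1 & 0 \end{bmatrix} \frac{v_1}{\sqrt{8}} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad \text{and} \quad u_2 = \begin{bmatrix} 2 & 2 & 0 \\ -1 & 1 & 0 \end{bmatrix} \frac{v_2}{\sqrt{2}} = \begin{bmatrix} 0 \\ -1 \end{bmatrix}.$$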
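
A numerical SVD gives the same singular values and confirms $AV = U\Sigma$. The sketch below assumes NumPy and takes $A$ to be the 2 by 3 matrix with rows $(2, 2, 0)$ and $(-1, 1, 0)$, a matrix whose $A^T A$ has eigenvalues 8, 2, 0; that choice of $A$ is an assumption of this example. The signs of the computed singular vectors may differ from a hand calculation.

```python
import numpy as np

# Assumed 2 by 3 example: A^T A = [[5, 3, 0], [3, 5, 0], [0, 0, 0]]
A = np.array([[2, 2, 0], [-1, 1, 0]], dtype=float)

# Full SVD: U is 2 by 2, Vt is 3 by 3, s holds the singular values
U, s, Vt = np.linalg.svd(A)

# Build the 2 by 3 matrix Sigma with the singular values on its diagonal
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

V = Vt.T
print("sigma^2 =", s**2)                        # eigenvalues of A^T A (nonzero ones)
print("A V = U Sigma:", np.allclose(A @ V, U @ Sigma))
print("A = U Sigma V^T:", np.allclose(U @ Sigma @ Vt, A))
print("A v3 =", A @ V[:, 2])                    # v3 spans the nullspace of A
```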
The Fundamental Theorem of Linear Algebra


I think of the SVD as the final step in the Fundamental Theorem. First come the dimensions
of the four subspaces in Figure 7.3. Then comes the orthogonality of those pairs of subspaces.
Now come the orthonormal bases of $v$'s and $u$'s that diagonalize $A$:


[Figure 7.3 diagram: the four fundamental subspaces with the SVD bases, including the nullspace of $A$ (dimension $n - r$).]


Figure 7.3: Orthonormal bases of $v$'s and $u$'s that diagonalize $A$: $m$ by $n$ with rank $r$.


The “norm” of $A$ is its largest singular value: $\|A\| = \sigma_1$. This measures the largest
possible ratio of $\|Av\|$ to $\|v\|$. That ratio of lengths is a maximum when $v = v_1$ and
$Av = \sigma_1 u_1$. This singular value $\sigma_1$ is a much better measure for the size of a matrix than
the largest eigenvalue. An extreme case can have zero eigenvalues and just one eigenvector
$(1, 1)$ for $A$. But $A^T A$ can still be large: if $v = (1, -1)$ then $Av$ is 200 times larger.


$$A = \begin{bmatrix} 100 & -100 \\ 100 & -100 \end{bmatrix} \quad \text{has } \lambda_{\max} = 0. \quad \text{But } \sigma_{\max} = \text{norm of } A = 200. \qquad (6)$$
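
The gap between the largest eigenvalue and the largest singular value is easy to see numerically. A minimal sketch, assuming NumPy, using a 2 by 2 matrix with entries $\pm 100$ that has both eigenvalues equal to zero and only the eigenvector direction $(1, 1)$; computed eigenvalues of such a defective matrix are only approximately zero.

```python
import numpy as np

A = np.array([[100, -100], [100, -100]], dtype=float)

eigs = np.linalg.eigvals(A)            # both eigenvalues are (numerically) zero
norm_A = np.linalg.norm(A, 2)          # matrix 2-norm = largest singular value

v = np.array([1.0, -1.0])
ratio = np.linalg.norm(A @ v) / np.linalg.norm(v)

print("eigenvalues:", eigs)
print("norm of A:", norm_A)
print("||Av|| / ||v|| for v = (1, -1):", ratio)
```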
The Condition Number


A valuable property of $A = U \Sigma V^T$ is that it puts the pieces of $A$ in order of importance.
Multiplying a column $u_i$ times a row $\sigma_i v_i^T$ produces one piece of the matrix. There will be
$r$ nonzero pieces from $r$ nonzero $\sigma$'s, when $A$ has rank $r$. The pieces add up to $A$, when we
multiply columns of $U$ times rows of $\Sigma V^T$:


$$\text{The pieces have rank 1} \qquad A = U \Sigma V^T = \begin{bmatrix} u_1 & \cdots & u_r \end{bmatrix} \begin{bmatrix} \sigma_1 v_1^T \\ \vdots \\ \sigma_r v_r^T \end{bmatrix} = u_1 (\sigma_1 v_1^T) + \cdots + u_r (\sigma_r v_r^T). \qquad (7)$$


The first piece gives the norm of $A$ which is $\sigma_1$. The last piece gives the norm of $A^{-1}$, which
is $1/\sigma_n$ when $A$ is invertible. The condition number is $\sigma_1$ times $1/\sigma_n$:


$$\text{Condition number of } A \qquad c(A) = \|A\| \, \|A^{-1}\| = \frac{\sigma_1}{\sigma_n}. \qquad (8)$$


This number $c(A)$ is the key to numerical stability in solving $Av = b$. When $A$ is an
orthogonal matrix, the symmetric $S = A^T A$ is the identity matrix. So all singular values of
an orthogonal matrix are $\sigma = 1$. At the other extreme, a singular matrix has $\sigma_n = 0$.
In that case $c = \infty$. Orthogonal matrices have the best condition number $c = 1$.
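
The rank-one pieces and the condition number can both be read off from a computed SVD. This is a sketch assuming NumPy; the 2 by 2 matrix `A` and the rotation angle are hypothetical choices made for illustration.

```python
import numpy as np

A = np.array([[3, 0], [4, 5]], dtype=float)    # hypothetical invertible example

U, s, Vt = np.linalg.svd(A)

# Rank-one pieces sigma_i * u_i * v_i^T, in order of importance
pieces = [s[i] * np.outer(U[:, i], Vt[i]) for i in range(len(s))]

# Condition number sigma_1 / sigma_n
cond_A = s[0] / s[-1]

# An orthogonal matrix (a rotation) has all singular values equal to 1
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
cond_Q = np.linalg.cond(Q)

print("pieces add to A:", np.allclose(sum(pieces), A))
print("c(A) =", cond_A, " c(Q) =", cond_Q)
```

For this `A`, $A^T A$ has eigenvalues 45 and 5, so the singular values are $\sqrt{45}$ and $\sqrt{5}$ and their ratio is 3.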


Data Matrices : Application of the SVD 


“Big data” is the linear algebra problem of this century (and we won’t solve it here). 
Sensors and scanners and imaging devices produce enormous volumes of information. 
Making decisive sense of that data is the problem for a world of analysts (mathematicians 
and statisticians of a new type). Most often the data comes in the form of a matrix. 


The usual approach is by PCA: Principal Component Analysis. That is essentially the
SVD. The first piece $\sigma_1 u_1 v_1^T$ holds the most information (in statistics this piece has
the greatest variance). It tells us the most. The Chapter 7 Notes include references.


REVIEW OF THE KEY IDEAS


1. Positive definite symmetric matrices have positive eigenvalues and pivots and energy.

2. $S = A^T A$ is positive definite if and only if $A$ has independent columns.

3. $x^T A^T A x = (Ax)^T (Ax)$ is zero when $Ax = 0$. $A^T A$ can be positive semidefinite.

4. The SVD is a factorization $A = U \Sigma V^T$ = (orthogonal) (diagonal) (orthogonal).

5. The columns of $V$ and $U$ are eigenvectors of $A^T A$ and $A A^T$ (singular vectors of $A$).

6. Those orthonormal bases achieve $Av_i = \sigma_i u_i$ and $A$ is diagonalized.

7. The largest piece of $A = \sigma_1 u_1 v_1^T + \cdots + \sigma_r u_r v_r^T$ gives the norm $\|A\| = \sigma_1$.
Problem Set 7.2


1 For a 2 by 2 matrix, suppose the 1 by 1 and 2 by 2 determinants $a$ and $ac - b^2$ are
positive. Then $c > b^2/a$ is also positive.


(i) $\lambda_1$ and $\lambda_2$ have the same sign because their product $\lambda_1 \lambda_2$ equals ____.


(ii) That sign is positive because $\lambda_1 + \lambda_2$ equals ____.
Conclusion: The tests $a > 0$, $ac - b^2 > 0$ guarantee positive eigenvalues $\lambda_1, \lambda_2$.


2 Which of $S_1, S_2, S_3, S_4$ has two positive eigenvalues? Use $a$ and $ac - b^2$, don't
compute the $\lambda$'s. Find an $x$ with $x^T S_1 x < 0$, confirming that $S_1$ fails the test.


, _|d 6 21 2 oi We lO | de LO 
s.=[5 4 s= [7 =| s=|,4 4 s=| 5 ae 
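3 For which numbers $b$ and $c$ are these matrices positive definite?

$$S = \begin{bmatrix} 1 & b \\ b & 9 \end{bmatrix} \qquad S = \begin{bmatrix} 2 & 4 \\ 4 & c \end{bmatrix} \qquad S = \begin{bmatrix} c & b \\ b & c \end{bmatrix}$$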


4 What is the energy $q = ax^2 + 2bxy + cy^2 = x^T S x$ for each of these matrices?
Complete the square to write $q$ as a sum of squares $d_1(\;\;)^2 + d_2(\;\;)^2$.


Vannes. di 3 
s=(} 2] ma s=[} 3]. 


5 $x^T S x = 2x_1 x_2$ certainly has a saddle point and not a minimum at $(0, 0)$. What
symmetric matrix $S$ produces this energy? What are its eigenvalues?


6 Test to see if $A^T A$ is positive definite in each case:


2 
1. 2 1? ee 2 
re FE | and A= : ; and A= 3 9 A : 


7 Which 3 by 3 symmetric matrices $S$ and $T$ produce these quadratic energies?


$x^T S x = 2(x_1^2 + x_2^2 + x_3^2 - x_1 x_2 - x_2 x_3)$. Why is $S$ positive definite?


$x^T T x = 2(x_1^2 + x_2^2 + x_3^2 - x_1 x_2 - x_1 x_3 - x_2 x_3)$. Why is $T$ semidefinite?


8 Compute the three upper left determinants of $S$ to establish positive definiteness.
(The first is 2.) Verify that their ratios give the second and third pivots.


2 0 
Pivots = ratios of determinants S=|2 3 
0 8 


w ob 
9 For what numbers $c$ and $d$ are $S$ and $T$ positive definite? Test the 3 determinants:
eAl Lt 2 3 
S= |t “ 4 and T=|2 d 4 
1 c 3.4 5 


10 If $S$ is positive definite then $S^{-1}$ is positive definite. Best proof: The eigenvalues
of $S^{-1}$ are positive because ____. Second proof (only for 2 by 2):


$$\text{The entries of } S^{-1} = \frac{1}{ac - b^2} \begin{bmatrix} c & -b \\ -b & a \end{bmatrix} \text{ pass the determinant tests } \_\_\_\_.$$


11 If $S$ and $T$ are positive definite, their sum $S + T$ is positive definite. Pivots and
eigenvalues are not convenient for $S + T$. Better to prove $x^T (S + T) x > 0$.


12 A positive definite matrix cannot have a zero (or even worse, a negative number)
on its diagonal. Show that this matrix fails to have $x^T S x > 0$:


$$\begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \begin{bmatrix} 4 & 1 & 1 \\ 1 & 0 & 2 \\ 1 & 2 & 5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \text{ is not positive when } (x_1, x_2, x_3) = (\;\;,\;\;,\;\;).$$


13 A diagonal entry $a_{jj}$ of a symmetric matrix cannot be smaller than all the $\lambda$'s. If it
were, then $A - a_{jj} I$ would have ____ eigenvalues and would be positive definite.
But $A - a_{jj} I$ has a ____ on the main diagonal.


14 Show that if all $\lambda > 0$ then $x^T S x > 0$. We must do this for every nonzero $x$,
not just the eigenvectors. So write $x$ as a combination of the eigenvectors and
explain why all “cross terms” are $x_i^T x_j = 0$. Then $x^T S x$ is


$$(c_1 x_1 + \cdots + c_n x_n)^T (c_1 \lambda_1 x_1 + \cdots + c_n \lambda_n x_n) = c_1^2 \lambda_1 x_1^T x_1 + \cdots + c_n^2 \lambda_n x_n^T x_n > 0.$$


15 Give a quick reason why each of these statements is true: 


(a) Every positive definite matrix is invertible. 

(b) The only positive definite projection matrix is P = I. 

(c) A diagonal matrix with positive diagonal entries is positive definite. 

(d) A symmetric matrix with a positive determinant might not be positive definite ! 
16 With positive pivots in $D$, the factorization $S = LDL^T$ becomes $L\sqrt{D}\sqrt{D}L^T$.
(Square roots of the pivots give $D = \sqrt{D}\sqrt{D}$.) Then $A = \sqrt{D} L^T$ yields the
Cholesky factorization $S = A^T A$ which is “symmetrized LU”:


4 8&8 
8 25 


From A= fe | find S. From S = find A = chol(S). 
17 Without multiplying $S = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 5 \end{bmatrix} \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$, find
(a) the determinant of $S$ (b) the eigenvalues of $S$
(c) the eigenvectors of $S$ (d) a reason why $S$ is symmetric positive definite.


18 For $F_1(x,y) = \frac{1}{4}x^4 + x^2 y + y^2$ and $F_2(x,y) = x^3 + xy - x$ find the second
derivative matrices $H_1$ and $H_2$:

Test for minimum  $H = \begin{bmatrix} \partial^2 F/\partial x^2 & \partial^2 F/\partial x\,\partial y \\ \partial^2 F/\partial y\,\partial x & \partial^2 F/\partial y^2 \end{bmatrix}$ is positive definite

$H_1$ is positive definite so $F_1$ is concave up (= convex). Find the minimum point of $F_1$
and the saddle point of $F_2$ (look only where first derivatives are zero).

19 The graph of $z = x^2 + y^2$ is a bowl opening upward. The graph of $z = x^2 - y^2$ is
a saddle. The graph of $z = -x^2 - y^2$ is a bowl opening downward. What is a test on
$a, b, c$ for $z = ax^2 + 2bxy + cy^2$ to have a saddle point at $(0,0)$?


20 Which values of $c$ give a bowl and which $c$ give a saddle point for the graph of
$z = 4x^2 + 12xy + cy^2$? Describe this graph at the borderline value of $c$.


21 When $S$ and $T$ are symmetric positive definite, $ST$ might not even be symmetric.
But its eigenvalues are still positive. Start from $STx = \lambda x$ and take dot products
with $Tx$. Then prove $\lambda > 0$.

22 Suppose $C$ is positive definite (so $y^T C y > 0$ whenever $y \ne 0$) and $A$ has independent
columns (so $Ax \ne 0$ whenever $x \ne 0$). Apply the energy test to $x^T A^T C A x$
to show that $A^T C A$ is positive definite: the crucial matrix in engineering.


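23 Find the eigenvalues and unit eigenvectors $v_1, v_2$ of $A^T A$. Then find $u_1 = Av_1/\sigma_1$:

$A = \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix}$ and $A^T A = \begin{bmatrix} 10 & 20 \\ 20 & 40 \end{bmatrix}$ and $AA^T = \begin{bmatrix} 5 & 15 \\ 15 & 45 \end{bmatrix}$

Verify that $u_1$ is a unit eigenvector of $AA^T$. Complete the matrices $U, \Sigma, V$.

SVD  $\begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix} = \begin{bmatrix} u_1 & u_2 \end{bmatrix}\begin{bmatrix} \sigma_1 & \\ & 0 \end{bmatrix}\begin{bmatrix} v_1 & v_2 \end{bmatrix}^T.$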


24 Write down orthonormal bases for the four fundamental subspaces of this A. 


25 (a) Why is the trace of $A^T A$ equal to the sum of all $a_{ij}^2$?

(b) For every rank-one matrix, why is $\sigma_1^2$ = sum of all $a_{ij}^2$?
26 Find the eigenvalues and unit eigenvectors of $A^T A$ and $AA^T$. Keep each $Av = \sigma u$:

Fibonacci matrix  $A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$

Construct the singular value decomposition and verify that $A$ equals $U\Sigma V^T$.

27 Compute $A^T A$ and $AA^T$ and their eigenvalues and unit eigenvectors for $V$ and $U$.

Rectangular matrix  $A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$

Check $AV = U\Sigma$ (this will decide $\pm$ signs in $U$). $\Sigma$ has the same shape as $A$.

28 Construct the matrix with rank one that has $Av = 12u$ for $v = \frac{1}{2}(1,1,1,1)$ and
$u = \frac{1}{3}(2,2,1)$. Its only singular value is $\sigma_1 = $ ____.

29 Suppose $A$ is invertible (with $\sigma_1 > \sigma_2 > 0$). Change $A$ by as small a matrix as
possible to produce a singular matrix $A_0$. Hint: $U$ and $V$ do not change.

From $A = \begin{bmatrix} u_1 & u_2 \end{bmatrix}\begin{bmatrix} \sigma_1 & \\ & \sigma_2 \end{bmatrix}\begin{bmatrix} v_1 & v_2 \end{bmatrix}^T$ find the nearest $A_0$.

30 The SVD for $A + I$ doesn’t use $\Sigma + I$. Why is $\sigma(A + I)$ not just $\sigma(A) + I$?

31 Multiply $A^T A v = \sigma^2 v$ by $A$. Put in parentheses to show that $Av$ is an eigenvector
of $AA^T$. We divide by its length $\|Av\| = \sigma$ to get the unit eigenvector $u$.

32 My favorite example of the SVD is when $Av(x) = dv/dx$, with the endpoint conditions
$v(0) = 0$ and $v(1) = 0$. We are looking for orthogonal functions $v(x)$
so that their derivatives $Av = dv/dx$ are also orthogonal. The perfect choice is
$v_1 = \sin \pi x$ and $v_2 = \sin 2\pi x$ and $v_k = \sin k\pi x$. Then each $u_k$ is a cosine.

The derivative of $v_1$ is $Av_1 = \pi\cos \pi x = \pi u_1$. The singular values are $\sigma_1 = \pi$
and $\sigma_k = k\pi$. Orthogonality of the sines (and orthogonality of the cosines) is the
foundation for Fourier series.

You may object to $AV = U\Sigma$. The derivative $A = d/dx$ is not a matrix! The
orthogonal factor $V$ has functions $\sin k\pi x$ in its columns, not vectors. The matrix $U$
has cosine functions $\cos k\pi x$. Since when is this allowed? One answer is to refer you
to the chebfun package on the web. This extends linear algebra to matrices whose
columns are functions, not vectors.

Another answer is to replace $d/dx$ by a first difference matrix $A$. Its shape will be
$N + 1$ by $N$. $A$ has 1’s down the diagonal and $-1$’s on the diagonal below. Then
$AV = U\Sigma$ has discrete sines in $V$ and discrete cosines in $U$. For $N = 2$ those will be
sines and cosines of 30° and 60° in $v_1$ and $u_1$.

Can you construct the difference matrix $A$ (3 by 2) and $A^T A$ (2 by 2)? The discrete
sines are $v_1 = (\sqrt{3}/2, \sqrt{3}/2)$ and $v_2 = (\sqrt{3}/2, -\sqrt{3}/2)$. Test that $Av_1$ is orthogonal
to $Av_2$. What are the singular values $\sigma_1$ and $\sigma_2$ in $\Sigma$?
7.3 Boundary Conditions Replace Initial Conditions


This section is about steady-state problems, not initial-value problems. The time variable t 
is replaced by the space variable x. Instead of two initial conditions at t = 0, we have one 
boundary condition at x = 0 and another boundary condition at x = 1. 


Here is the simplest two-point boundary value problem for y(x). Start with f(x) = 1. 


$$\text{Two boundary conditions} \qquad -\frac{d^2 y}{dx^2} = f(x) \quad \text{with } y(0) = 0 \text{ and } y(1) = 0. \tag{1}$$


One particular solution $y_p(x)$ will come from integrating $f(x)$ twice. If $f(x) = 1$ then
two integrations give $x^2/2$, and the minus sign in (1) leads to $y_p = -x^2/2$.


The null solutions $y_n(x)$ solve the equation with zero force: $-y_n'' = 0$. The second
derivative is zero for any linear function $y_n = Cx + D$. These are the null solutions.


We can use those two constants $C$ and $D$ to satisfy the two boundary conditions on the
complete solution $y(x) = y_p + y_n = -x^2/2 + Cx + D$.


$y(0) = 0$ and $y(1) = 0$    Set $x = 0$ and $x = 1$    $D = 0$ and $-\frac{1}{2} + C + D = 0$

The boundary conditions give $D = 0$ and $C = \frac{1}{2}$. Then the solution is $y = y_p + y_n$:

$$\text{Solution} \qquad y(x) = -\frac{x^2}{2} + \frac{x}{2} = \frac{x - x^2}{2}$$

The graph of the parabola starts at $y = 0$ and returns (fixed ends). The slope $y' = \frac{1}{2} - x$ is
decreasing. The second derivative is $y'' = -1$ and the parabola is bending down.

This boundary-value problem describes a bar that has its top and bottom both fixed.
The weight of the bar stretches it downward. At point $x$ down the bar, the displacement
is $y(x)$. So this fixed-fixed bar has $y(0) = 0$ and $y(1) = 0$. The force of gravity can
be $f(x) = 1$. The bar stretches in the top half where $dy/dx > 0$. The bottom half is
compressed because $dy/dx < 0$. Halfway down at $x = \frac{1}{2}$ is the largest displacement
(top of the parabola). That halfway point has $y_{\max} = \frac{1}{2}(x - x^2) = \frac{1}{8}$.

I think of this elastic bar as one long spring. If we pulled it down in the middle, it
would start to oscillate. That is not our problem now. Our bar is not moving: the
oscillation is all damped out. The stretching comes from the bar’s own weight.
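
The fixed-fixed solution can be checked numerically. The following is a minimal Python sketch, not part of the original text: the names `y`, `h` and `minus_second_difference` are illustrative choices. It relies on the fact that a centered second difference is exact for a parabola, up to roundoff.

```python
def y(x):
    """Fixed-fixed solution of -y'' = 1 with y(0) = y(1) = 0."""
    return (x - x * x) / 2.0

h = 0.01  # illustrative step for the centered second difference

def minus_second_difference(x):
    # -(y(x+h) - 2y(x) + y(x-h)) / h^2 approximates -y''(x)
    return -(y(x + h) - 2.0 * y(x) + y(x - h)) / (h * h)

ends = (y(0.0), y(1.0))               # both boundary values
load = minus_second_difference(0.3)   # should be close to f(x) = 1
y_max = y(0.5)                        # largest displacement, halfway down
print(ends, load, y_max)
```

The printed values confirm the two boundary conditions, the unit load, and the maximum displacement 1/8 at the midpoint.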
A Delta Function


This is my chance to introduce again the mysterious but extremely useful function
$f(x) = \delta(x - a)$. This delta function is zero except at $x = a$. The bar is now so
light that we can ignore its weight. All the force on the bar is at one point $x = a$. At
that point a unit weight is stretching the bar above $x = a$ and compressing the bar below.

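Here is an informal definition of the delta function (the symbol $\infty$ doesn’t carry
enough information by itself). The good definition is based on integrating the function
across the point $x = a$. The integral is 1.

$$\text{Delta function} \quad \delta(x - a) = \begin{cases} 0 & x \ne a \\ \infty & x = a \end{cases} \qquad \int_{-\infty}^{\infty} \delta(x - a)\,dx = 1 \qquad \int_{-\infty}^{\infty} \delta(x - a)F(x)\,dx = F(a)$$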


The graph of $\delta(x - a)$ has an infinite spike at $x = a$. That spike is at $x = a = 0$ for
the standard delta function $\delta(x)$. The function is zero away from the spike and infinite
at that one point. The area under this one-point spike is 1.

This tells us that $\delta(x)$ cannot be a true function. It is somehow a limit of box functions
$B_N(x)$ that have height $N$ over short intervals of width $1/N$. The area of each box is 1:

$$\text{Box functions} \quad B_N(x) = \begin{cases} 0 & |x| > 1/2N \\ N & |x| < 1/2N \end{cases} \qquad \int B_N(x)\,dx = \text{box area} = 1 \qquad \int B_N(x)F(x)\,dx \text{ approaches } F(0)$$


Mathematically, $\delta(x)$ and its shifts $\delta(x - a)$ are not functions. Physically, they represent
action that is concentrated at a single point. In reality that action is probably over a very 
short interval, like the box functions, but the width of that interval is of no importance. 
What matters is the total impulse when a bat hits a ball, or the total force when a weight 
hangs on a bar. 

The shifted delta function $\delta(x - a)$ is the derivative of the step function $H(x - a)$.
The step function jumps from 0 to 1 at $x = a$. Then $\delta$ must integrate to 1.
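
The limit of box functions can be watched numerically. This is a small Python sketch under stated assumptions: the helper `box_average` and the midpoint rule with `samples` points are illustrative choices, and $F(x) = \cos x$ is just one smooth test function.

```python
import math

def box_average(F, N, samples=1000):
    """Midpoint-rule value of the integral of B_N(x) F(x) dx.

    B_N has height N on |x| < 1/(2N), so only that short interval contributes.
    """
    width = 1.0 / N
    dx = width / samples
    total = 0.0
    for k in range(samples):
        x = -width / 2.0 + (k + 0.5) * dx
        total += N * F(x) * dx
    return total

for N in (1, 10, 100, 1000):
    print(N, box_average(math.cos, N))   # approaches F(0) = cos 0 = 1
```

As $N$ grows the box narrows, and the weighted integral settles on the single value $F(0)$. That is the behavior the delta function is defined to have.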


Response to a Delta Function is a Ramp Function 


How to solve the differential equation $-y'' = \delta(x - a)$? One integration of the delta
function gives a step function. A second integration gives a ramp function or corner
function. The solution $y(x)$ must be linear (straight line graph) to the left of $x = a$,
because $d^2y/dx^2 = 0$. And $y(x)$ is also linear to the right of $x = a$: constant slope.


The slope of $y(x)$ drops by 1 at the point $x = a$. To see why $-1$ is the jump in slope
(there is no jump in $y$!), integrate $y''$ across the point $x = a$ to get the change $-1$ in $y'$:

$$y'' = -\delta(x - a) \qquad \int y''\,dx = \left[\frac{dy}{dx}\right]_{\text{left of } a}^{\text{right of } a} = \int -\delta(x - a)\,dx = -1 \tag{2}$$
The solution $y(x)$ starts with a fixed slope $s$. At $x = a$ it changes to slope $s - 1$
(the slope drops by 1). At the point $x = 1$, the bottom of the bar is fixed at $y(1) = 0$.


The constant upward slope $s$ over a distance $a$ and the downward slope $s - 1$ over the
remaining distance $1 - a$ must bring the function $y(x)$ to zero:

$$sa + (s - 1)(1 - a) = 0 \quad \text{gives} \quad sa + s - sa - 1 + a = 0. \quad \text{Then } s = 1 - a. \tag{3}$$

The graph of $y = sx$ goes up to $sa = (1 - a)a$. Then $y(x)$ goes back down to zero.


Figure 7.4: $-y'' = \delta(x - a)$ is solved by a ramp function that has a corner at $x = a$.
At that corner point the slope $y'$ (which is a step function) drops by 1. Then $y'' = -\delta$.


How is the elastic bar stretched and compressed by this point load at $x = a = \frac{1}{3}$?
The top third of the bar is stretched, the lower two thirds are compressed. The point $x = a$
shows the highest point on the graph of $y(x)$ and the greatest displacement. That downward
displacement is $y(a) = a(1 - a) = \frac{2}{9}$.


Uniform stretching above the point load. Uniform compression below the point load. 
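
The ramp solution can be written out and tested directly. This is a hedged Python sketch: the function name `ramp` and the step `h` are illustrative, and the formula below the load, $a(1 - x)$, is the line of slope $s - 1 = -a$ that reaches zero at $x = 1$.

```python
def ramp(x, a):
    """Solution of -y'' = delta(x - a) with y(0) = y(1) = 0."""
    if x <= a:
        return (1.0 - a) * x        # slope s = 1 - a above the load
    return a * (1.0 - x)            # slope s - 1 = -a below the load

a = 1.0 / 3.0
h = 1e-3  # illustrative step for one-sided slopes at the corner
slope_left = (ramp(a, a) - ramp(a - h, a)) / h
slope_right = (ramp(a + h, a) - ramp(a, a)) / h
print(ramp(0.0, a), ramp(1.0, a))       # fixed ends
print(slope_right - slope_left)         # jump in slope, close to -1
print(ramp(a, a))                       # a(1 - a) = 2/9
```

The two pieces meet at $x = a$ with no jump in $y$, and the slope drops by exactly 1 there.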


Eigenvalues and Eigenfunctions 


For a square matrix, the eigenvector equation is $Ax = \lambda x$. For the second derivative (with
a minus sign) and for a boundary condition at both endpoints, the eigenvector $x$ becomes
an eigenfunction $y(x)$:

$$\text{Eigenvalues of } -\frac{d^2}{dx^2} \qquad -\frac{d^2 y}{dx^2} = \lambda y \quad \text{with } y(0) = 0 \text{ and } y(1) = 0. \tag{4}$$


We can find these eigenfunctions $y(x)$. The solutions to the second order equation
$y'' + \lambda y = 0$ are sines and cosines when $\lambda > 0$. The boundary conditions choose sines:

$y(x) = A\cos(\sqrt{\lambda}\,x) + B\sin(\sqrt{\lambda}\,x)$ before applying the boundary conditions
$y(0) = 0$ requires $A = 0$    $y = \sin\sqrt{\lambda}\,x = 0$ at $x = 1$ requires $\sqrt{\lambda} = n\pi$


The eigenfunction is $y(x) = \sin n\pi x$. The eigenvalue is $\lambda = n^2\pi^2$ for $n = 1, 2, 3, \ldots$
Then $-y'' = \lambda y$. We have infinitely many $y$ and $\lambda$, not surprising since $S = -d^2/dx^2$ is not
a matrix. It is an “operator” and it acts on functions $y(x)$.
The Second Derivative $-d^2/dx^2$ is Symmetric Positive Definite


The derivatives $Ay = dy/dx$ and $Sy = -d^2y/dx^2$ are linear operators. The first derivative
$A$ is antisymmetric. The second derivative $S$ is symmetric. $S$ is also positive definite,
because of that minus sign. Its eigenvalues $\lambda = n^2\pi^2$ are all positive.

We will use the symbols $A^T$ and $S^T$, even though $A$ and $S$ are not matrices. To give
meaning to $A^T = -A$ and $S^T = S$, we need the inner product $(f, g)$ of two functions:

$$\text{Inner product of } f \text{ and } g \qquad (f(x), g(x)) = \int_0^1 f(x)\,g(x)\,dx. \tag{5}$$


This is the continuous form of the dot product $u \cdot v = u^T v$ of two vectors. For $u \cdot v$
we multiply the components $u_i$ and $v_i$, and add. For functions we multiply the values
of $f(x)$ and $g(x)$, and then integrate as in (5).

A matrix is symmetric if $Su \cdot v$ equals $u \cdot Sv$ for all vectors. Then $(Su)^T v = u^T(S^T v)$
agrees with $u^T(Sv)$. An operator is symmetric if $(Sf, g)$ equals $(f, Sg)$ for all functions that
satisfy the boundary conditions. Use two integrations by parts to shift the second derivative
operator $S$ from $f$ onto $g$:

$$\text{Integration by parts twice} \qquad \int_0^1 -\frac{d^2 f}{dx^2}\,g(x)\,dx = \int_0^1 \frac{df}{dx}\,\frac{dg}{dx}\,dx = \int_0^1 f(x)\left(-\frac{d^2 g}{dx^2}\right)dx. \tag{6}$$

The integrated terms $[g\,df/dx]_0^1$ and $[f\,dg/dx]_0^1$ in the two integrations by parts are zero
because $f = g = 0$ at both endpoints.
The left side and right side of (6) are the inner products $(Sf, g)$ and $(f, Sg)$. Moving $S$
from $f$ onto $g$ always produces $S^T$. Here we have $S = S^T$ and symmetry is confirmed.
Thus the second derivative $S = -d^2/dx^2$ is symmetric positive definite (this is why we
included the minus sign). Section 7.2 gave two other tests, in addition to positive eigenvalues.
One test is positive energy, and that test is also passed. Choose g = f: 


$$\text{Positive energy} \qquad f^T S f = (Sf, f) = \int_0^1 -\frac{d^2 f}{dx^2}\,f(x)\,dx = \int_0^1 \left(\frac{df}{dx}\right)^2 dx > 0. \tag{7}$$

Zero energy requires $df/dx = 0$. Then the boundary conditions ensure $f(x) = 0$.


The third test for a positive definite $S$ looks for $A$ so that $S = A^T A$. Here $A$ is the first
derivative ($Af = df/dx$). The boundary conditions are still $f(0) = 0$ and $f(1) = 0$.
Problem 1 will show that $A^T g$ is $-dg/dx$, with a minus sign from one integration by
parts. Altogether $S = -d^2/dx^2 = (-d/dx)(d/dx) = A^T A$.
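
The symmetry and energy identities can be tested on two concrete functions. This is a Python sketch under stated assumptions: the polynomials $f = x(1-x)$ and $g = x^2(1-x)$ are illustrative choices that vanish at both endpoints, and `simpson` is a hand-written composite Simpson rule (exact for these cubic integrands, up to roundoff).

```python
def simpson(func, n=100):
    """Composite Simpson rule on [0, 1] with n (even) subintervals."""
    h = 1.0 / n
    total = func(0.0) + func(1.0)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(k * h)
    return total * h / 3.0

# Two illustrative functions that vanish at x = 0 and x = 1
f = lambda x: x * (1.0 - x)
df = lambda x: 1.0 - 2.0 * x
Sf = lambda x: 2.0                      # -f''
g = lambda x: x * x * (1.0 - x)
dg = lambda x: 2.0 * x - 3.0 * x * x
Sg = lambda x: 6.0 * x - 2.0            # -g''

Sf_g = simpson(lambda x: Sf(x) * g(x))      # (Sf, g)
f_Sg = simpson(lambda x: f(x) * Sg(x))      # (f, Sg)
mixed = simpson(lambda x: df(x) * dg(x))    # (Af, Ag), the middle term of (6)
energy = simpson(lambda x: Sf(x) * f(x))    # (Sf, f), the energy in (7)
print(Sf_g, f_Sg, mixed, energy)
```

The three inner products in (6) agree (each is 1/6 here), and the energy (1/3) is positive.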
Solving the Heat Equation


Differential equations in time give a chance to use all the eigenfunctions $\sin(n\pi x)$.
An outstanding example is the heat equation $\partial u/\partial t = \partial^2 u/\partial x^2 = -Su$. The
eigenvalues of $-S$ are $-n^2\pi^2$, and the negative definite $-S$ leads to decay in time and
not growth. Temperatures die out exponentially when there is no fire. Here are the
two steps (developed much further in Section 8.3) to solve the heat equation $u_t = u_{xx}$:


1. Write the initial function $u(0, x)$ as a combination of the eigenfunctions $\sin n\pi x$:

$$\text{Fourier sine series} \qquad u_{\text{start}} = b_1 \sin \pi x + b_2 \sin 2\pi x + \cdots + b_n \sin n\pi x + \cdots \tag{8}$$

2. With $\lambda = -n^2\pi^2$, every eigenfunction decays. Superposition gives $u$ at time $t$:

$$u(t, x) = b_1 e^{-\pi^2 t} \sin \pi x + b_2 e^{-4\pi^2 t} \sin 2\pi x + \cdots = \sum_{n=1}^{\infty} b_n e^{-n^2\pi^2 t} \sin n\pi x \tag{9}$$

This is the famous Fourier series solution to the heat equation. Section 8.1 will show
how to compute the Fourier coefficients $b_1, b_2, \ldots$ (a simple formula even when there
are infinitely many $b$’s). You see how the solution is exactly analogous to
$y(t) = c_1 e^{-\lambda_1 t} x_1 + c_2 e^{-\lambda_2 t} x_2$. That solves an ODE, the heat equation is a PDE.
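
The two steps can be sketched in code for a finite number of terms. This is a minimal Python illustration, assuming the coefficients are already known: the list `b` below is made up for the example, and `heat_solution` simply evaluates a partial sum of (9).

```python
import math

def heat_solution(b, t, x):
    """Partial sum of (9): sum of b_n exp(-n^2 pi^2 t) sin(n pi x)."""
    total = 0.0
    for n, bn in enumerate(b, start=1):
        total += bn * math.exp(-(n * math.pi) ** 2 * t) * math.sin(n * math.pi * x)
    return total

b = [1.0, 0.5, 0.25]          # illustrative coefficients b_1, b_2, b_3
u_start = heat_solution(b, 0.0, 0.25)
u_later = heat_solution(b, 0.1, 0.25)
print(u_start, u_later)       # the temperature at x = 0.25 has decayed
```

Higher frequencies decay much faster because the exponent is $-n^2\pi^2 t$, so after a short time the first term dominates.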


Second Difference Matrix K


These pages will take a crucial first step in scientific computing. This is where differential
equations meet matrix equations. The continuous problem (here continuous in $x$, previously
in $t$) becomes discrete. Chapter 3 took that step for initial value problems, starting
with Euler’s forward difference $y(t + \Delta t) - y(t)$. Now we have problems $-y'' = f(x)$
with second derivatives. So we use second differences $y(x + \Delta x) - 2y(x) + y(x - \Delta x)$.


The second derivative is the derivative of $dy/dx$. The second difference is the
difference of $\Delta y/\Delta x$. For first differences we have choices: forward or backward or
centered differences. To approximate the second derivative $Sy = -y''$ there is one
outstanding centered choice. This uses the tridiagonal second difference matrix $K$:

$$-\frac{d^2 y}{dx^2} \approx \frac{KY}{(\Delta x)^2} \qquad KY = \begin{bmatrix} 2 & -1 & & \\ -1 & 2 & -1 & \\ & \ddots & \ddots & \ddots \\ & & -1 & 2 \end{bmatrix}\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{bmatrix} \tag{10}$$

A typical row $-1, 2, -1$ produces $-Y_1 + 2Y_2 - Y_3$ (this is row 2 of $KY$).

The numbers $Y_1$ to $Y_N$ are approximations to the true values $y(\Delta x), \ldots, y(N\Delta x)$
in the continuous problem. The boundary conditions $y(0) = 0$ and $y(1) = 0$ become
$Y_0 = 0$ and $Y_{N+1} = 0$. The step $\Delta x$ has length $1/(N + 1)$. The matrix $K$ correctly
takes $Y_0$ and $Y_{N+1}$ to be zero, by working only with $Y_1$ to $Y_N$.
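
The discrete problem $KY = (\Delta x)^2 f$ can be solved by ordinary elimination on the tridiagonal matrix. The sketch below is an assumption-laden illustration, not code from the text: `solve_K` is a hypothetical helper that eliminates and back-substitutes for the $-1, 2, -1$ matrix, and $N = 4$ is an arbitrary size. For the load $f(x) = 1$ the discrete values match the parabola $(x - x^2)/2$ at the meshpoints, because second differences of a quadratic are exact.

```python
def solve_K(rhs):
    """Solve K Y = rhs for the tridiagonal -1, 2, -1 matrix K."""
    N = len(rhs)
    pivots = [2.0] * N
    r = list(rhs)
    for i in range(1, N):                      # forward elimination
        pivots[i] = 2.0 - 1.0 / pivots[i - 1]
        r[i] += r[i - 1] / pivots[i - 1]
    Y = [0.0] * N
    Y[-1] = r[-1] / pivots[-1]
    for i in range(N - 2, -1, -1):             # back substitution
        Y[i] = (r[i] + Y[i + 1]) / pivots[i]
    return Y

N = 4                                  # illustrative number of interior points
dx = 1.0 / (N + 1)
Y = solve_K([dx * dx * 1.0] * N)       # K Y = (dx)^2 f with f(x) = 1
exact = [(x - x * x) / 2.0 for x in (dx * (i + 1) for i in range(N))]
print(Y)
print(exact)
```

The boundary values $Y_0 = Y_{N+1} = 0$ never appear in the code: they are built into the first and last rows of $K$.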
The Matrix K is Positive Definite


We know that the operator $S = -d^2/dx^2$ is positive definite. All of its eigenfunctions
$\sin n\pi x$ have positive eigenvalues $\lambda = n^2\pi^2$. So we hope that the matrix $K$ is also
positive definite. That is true, and most unusually for a matrix of any large size $N$,
we can find every eigenvector and eigenvalue of $K$.

The eigenvectors are the key. It doesn’t happen often that sampling the continuous
eigenfunctions at $N$ points produces the discrete eigenvectors. This is the most important
example in all of applied mathematics, of this unprecedented sampling for $y = \sin n\pi x$:

$$\text{The } N \text{ eigenvectors of } K \text{ are } y_n = (\sin n\pi\Delta x, \sin 2n\pi\Delta x, \ldots, \sin Nn\pi\Delta x). \tag{11}$$

$$\text{The } N \text{ eigenvalues of } K \text{ are the positive numbers } \lambda_n = 2 - 2\cos\frac{n\pi}{N+1}. \tag{12}$$

The 2 in every eigenvalue $\lambda$ comes from the 2’s along the diagonal of $K$ (that diagonal
is $2I$). The cosine in $\lambda$ and in the equation $Ky_n = \lambda_n y_n$ are checked in Problem 12.
All eigenvalues are positive because the cosines are below 1. Then $K$ is positive definite.
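
Formulas (11) and (12) are easy to verify numerically for a small matrix. In this Python sketch the size $N = 5$ and the helper name `K_times` are illustrative assumptions. The code multiplies $K$ by each sampled sine vector and compares with $\lambda_n y_n$.

```python
import math

def K_times(y):
    """Multiply the -1, 2, -1 matrix K by a vector y (zero beyond both ends)."""
    N = len(y)
    padded = [0.0] + list(y) + [0.0]
    return [-padded[i - 1] + 2.0 * padded[i] - padded[i + 1] for i in range(1, N + 1)]

N = 5                                   # illustrative size
dx = 1.0 / (N + 1)
residuals = []
eigenvalues = []
for n in range(1, N + 1):
    y = [math.sin(n * math.pi * k * dx) for k in range(1, N + 1)]
    lam = 2.0 - 2.0 * math.cos(n * math.pi / (N + 1))
    Ky = K_times(y)
    residuals.append(max(abs(Ky[k] - lam * y[k]) for k in range(N)))
    eigenvalues.append(lam)
print(eigenvalues)        # all positive
print(max(residuals))     # roundoff only
```

The eigenvalues also add to $2N$, the trace of $K$, which gives a second quick check.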

It is natural to try the other positive definite tests too (we don’t have to do this,
$\lambda > 0$ is enough). With a rectangular first difference matrix $A$, we have $K = A^T A$:

$$A^T A = K \qquad \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \end{bmatrix} = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix} \tag{13}$$

The three columns of that matrix $A$ are certainly independent. Therefore $A^T A$ is a positive
definite matrix, now proved twice.

Notice that $A^T$ is minus the usual forward difference matrix. $A$ is plus a backward
difference matrix. That sign change reflects the continuous case (for derivatives) where
the “transpose” of $d/dx$ is $-d/dx$. For every vector $f$, the energy $f^T K f$ is the same as
$f^T A^T A f = (Af)^T(Af) > 0$:

$$\text{The energy } \int_0^1 \left(\frac{df}{dx}\right)^2 dx \text{ becomes } f^T K f = (Af)^T(Af) = \sum_{n=1}^{N+1} (f_n - f_{n-1})^2 > 0.$$

The test of positive energy $f^T K f$ is passed, and $K$ is again proved to be positive definite.
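
The factorization $K = A^T A$ and the energy identity can be confirmed with a few lines of code. This is a plain-Python sketch with hypothetical helper names (`backward_difference`, `transpose_times`). The test vector `f` is arbitrary.

```python
def backward_difference(N):
    """The N+1 by N matrix A with 1 on the diagonal and -1 just below it."""
    A = [[0] * N for _ in range(N + 1)]
    for j in range(N):
        A[j][j] = 1
        A[j + 1][j] = -1
    return A

def transpose_times(A, B):
    """Compute A^T B for matrices stored as lists of rows."""
    rows, n, m = len(A), len(A[0]), len(B[0])
    return [[sum(A[k][i] * B[k][j] for k in range(rows)) for j in range(m)]
            for i in range(n)]

A = backward_difference(3)
K = transpose_times(A, A)
f = [1.0, -2.0, 0.5]                         # any nonzero vector
Af = [sum(A[i][j] * f[j] for j in range(3)) for i in range(4)]
energy = sum(v * v for v in Af)              # (Af)^T (Af)
fKf = sum(f[i] * K[i][j] * f[j] for i in range(3) for j in range(3))
print(K, energy, fKf)
```

The energy is a sum of squares of first differences, so it can only be zero when every difference is zero. With $f_0 = f_{N+1} = 0$ that forces $f = 0$.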


Boundary Conditions on the Slope 


The fixed-fixed boundary conditions are $y(0) = 0$ and $y(1) = 0$. One or both of those
conditions can change to a slope condition on $y' = dy/dx$. If the left condition changes
to $y'(0) = 0$, the top of our elastic bar is free instead of fixed. This is like a tall
building; $x = 0$ is up in the air (free) and $x = 1$ is down at the ground (fixed).
A fixed-free hanging bar combines $y(0) = 0$ at the top with $y'(1) = 0$ at the bottom.
Its matrix is still positive definite. But a free-free bar has no supports: semidefinite!

$$\text{Free-free } Sy = f \qquad -\frac{d^2 y}{dx^2} = f(x) \quad \text{with } \frac{dy}{dx}(0) = 0 \text{ and } \frac{dy}{dx}(1) = 0. \tag{14}$$

You will see that this problem generally has no solution. One eigenvalue is now $\lambda = 0$.

$$\text{Free-free } Sy = \lambda y \qquad -\frac{d^2 y}{dx^2} = \lambda y(x) \quad \text{with } \frac{dy}{dx} = 0 \text{ at } x = 0 \text{ and } x = 1. \tag{15}$$

The fixed-fixed problem had eigenfunctions $y(x) = \sin n\pi x$ and eigenvalues $\lambda = n^2\pi^2$. This
free-free problem will have $y(x) = \cos n\pi x$ and again $\lambda = n^2\pi^2$. Those cosines start and
end with zero slope. Also very important: The free-free problem has an extra eigenfunction
$y = \cos 0x$ (which is the constant function $y = 1$). And then $\lambda = 0$:

Constant $y$ and zero $\lambda$    $y = 1$ solves $-\dfrac{d^2 y}{dx^2} = \lambda y$ with eigenvalue $\lambda = 0$


Conclusion: The free-free problem (14) is only positive semidefinite. The eigenvalues
include $\lambda = 0$. The problem is singular and for most loads $f(x)$ there is no solution.


Example with $f(x) = x$  Show that $-y'' = x$ has no solution with $y'(0) = y'(1) = 0$.


Solution  Integrate both sides of $-y'' = x$ from $x = 0$ to $x = 1$. The right side gives
$\int x\,dx = \frac{1}{2}$. The left side gives $-\int y''\,dx = y'(0) - y'(1)$. But the boundary conditions
make this zero and there can be no solution to $0 = \frac{1}{2}$. An operator with a zero eigenvalue is
not invertible.


Free-free Difference Matrix B 


This problem $-y'' = f(x)$ with free-free conditions $y'(0) = y'(1) = 0$ leads to a singular
matrix (not invertible). This is still a second difference matrix, to approximate the second
derivative. But row 1 and row $N$ of the matrix are changed by the free-free boundary
conditions:

Free-free matrix $B$
Change $K_{11} = 2$ to $B_{11} = 1$
Change $K_{NN} = 2$ to $B_{NN} = 1$

$$B = \begin{bmatrix} 1 & -1 & & \\ -1 & 2 & -1 & \\ & \ddots & \ddots & \ddots \\ & & -1 & 1 \end{bmatrix} \text{ is not invertible.}$$

The slope $dy/dx$ is approximated by a first difference in row 1 and row $N$. All other rows
still contain the second difference $-1, 2, -1$. The usual $1, -2, 1$ has signs reversed because
the differential equation has $-d^2y/dx^2$.

How to see that $B$ is not invertible? MATLAB would find pivots $1, 1, \ldots, 1, 0$ from
elimination. The zero in the last pivot position means failure. We can see this failure directly
by solving $By = 0$. This is the fast way to show that a matrix is singular.

To show that $B$ is not invertible, find the constant solution to $By$ = zero vector.

$$\begin{matrix} y = \text{constant vector} \\ B = \text{singular matrix} \end{matrix} \qquad \begin{bmatrix} 1 & -1 & & \\ -1 & 2 & -1 & \\ & -1 & 2 & -1 \\ & & -1 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \tag{16}$$

If $B^{-1}$ existed, we could multiply $By = 0$ by $B^{-1}$ to find $y = 0$. But this $y$ is not zero.


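$B$ is positive semidefinite but it is not positive definite. We can still write the matrix
$B$ as $A^T A$, but in this free-free case the columns of $A$ will not be independent.

$$B = A^T A \qquad \begin{bmatrix} 1 & -1 & & \\ -1 & 2 & -1 & \\ & -1 & 2 & -1 \\ & & -1 & 1 \end{bmatrix} = \begin{bmatrix} -1 & 0 & 0 \\ 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{bmatrix}$$

With only 3 rows, the 4 columns of $A$ must be dependent. They add up to a zero column.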
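
Both claims about $B$ can be checked by direct multiplication. The sketch below is illustrative Python, with a hypothetical helper `free_free_B` and one of the two possible sign conventions for the 3 by 4 difference matrix $A$ (the opposite sign gives the same $A^T A$).

```python
def free_free_B(N):
    """Second difference matrix with B[0][0] = B[N-1][N-1] = 1."""
    B = [[0] * N for _ in range(N)]
    for i in range(N):
        B[i][i] = 2
        if i > 0:
            B[i][i - 1] = -1
        if i < N - 1:
            B[i][i + 1] = -1
    B[0][0] = 1
    B[N - 1][N - 1] = 1
    return B

N = 4
B = free_free_B(N)
ones = [1] * N
B_ones = [sum(B[i][j] * ones[j] for j in range(N)) for i in range(N)]

# A 3 by 4 first difference matrix (sign convention is an assumption)
A = [[-1, 1, 0, 0], [0, -1, 1, 0], [0, 0, -1, 1]]
AtA = [[sum(A[k][i] * A[k][j] for k in range(3)) for j in range(N)] for i in range(N)]
column_sum = [sum(row) for row in A]    # adding the 4 columns of A gives the zero column
print(B_ones, AtA == B, column_sum)
```

The constant vector is in the nullspace of $A$, so it is also in the nullspace of $B = A^T A$. That is the eigenvector for $\lambda = 0$.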


REVIEW OF THE KEY IDEAS

1. Two initial conditions for $y(0)$ and $y'(0)$ can change to two boundary conditions.
2. The fixed-fixed problem $-y'' = \lambda y$ with $y(0) = 0$ and $y(1) = 0$ has $\lambda = n^2\pi^2$.
3. The second difference matrix $K$ has $\lambda_n = 2 - 2\cos\frac{n\pi}{N+1} > 0$. Positive definite.
4. Eigenfunctions and eigenvectors are sines, from fixed-fixed boundary conditions.
5. The free-free problem with $y'(0) = y'(1) = 0$ has $y$ = cosines. This allows $\lambda = 0$.
6. The free-free matrix $B$ has $\lambda = 0$ with the eigenvector $y = (1, \ldots, 1)$. Semidefinite.


Problem Set 7.3 


1 Transpose the derivative with integration by parts: $(dy/dx, g) = -(y, dg/dx)$.

$Ay$ is $dy/dx$ with boundary conditions $y(0) = 0$ and $y(1) = 0$. Why is $\int y'g\,dx$
equal to $-\int yg'\,dx$? Then $A^T$ (which is normally written as $A^*$) is $A^T g = -dg/dx$
with no boundary conditions on $g$. $A^T Ay$ is $-y''$ with $y(0) = 0$ and $y(1) = 0$.

Problems 2-6 have boundary conditions at $x = 0$ and $x = 1$: no initial conditions.


2 Solve this boundary value problem in two steps. Find the complete solution $y_p + y_n$
with two constants in $y_n$, and find those constants from the boundary conditions:

Solve $-y'' = 12x^2$ with $y(0) = 0$ and $y(1) = 0$ and $y_p = -x^4$.

3 Solve the same equation $-y'' = 12x^2$ with $y(0) = 0$ and $y'(1) = 0$ (zero slope).


4 Solve the same equation $-y'' = 12x^2$ with $y'(0) = 0$ and $y(1) = 0$. Then try for
both slopes $y'(0) = 0$ and $y'(1) = 0$: this has no solution $y = -x^4 + Ax + B$.


5 Solve $-y'' = 6x$ with $y(0) = 2$ and $y(1) = 4$. Boundary values need not be zero.

6 Solve $-y'' = e^x$ with $y(0) = 5$ and $y(1) = 0$, starting from $y = y_p + y_n$.


Problems 7-11 are about the LU factors and the inverses of second difference matrices.


7 The matrix $T$ with $T_{11} = 1$ factors perfectly into $LU = A^T A$ (all its pivots are 1).

$$T = \begin{bmatrix} 1 & -1 & & \\ -1 & 2 & -1 & \\ & -1 & 2 & -1 \\ & & -1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & & & \\ -1 & 1 & & \\ & -1 & 1 & \\ & & -1 & 1 \end{bmatrix}\begin{bmatrix} 1 & -1 & & \\ & 1 & -1 & \\ & & 1 & -1 \\ & & & 1 \end{bmatrix} = LU$$

Each elimination step adds the pivot row to the next row (and $L$ subtracts to recover
$T$ from $U$). The inverses of those difference matrices $L$ and $U$ are sum matrices.
Then the inverse of $T = LU$ is $U^{-1}L^{-1}$:

$$T^{-1} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ & 1 & 1 & 1 \\ & & 1 & 1 \\ & & & 1 \end{bmatrix}\begin{bmatrix} 1 & & & \\ 1 & 1 & & \\ 1 & 1 & 1 & \\ 1 & 1 & 1 & 1 \end{bmatrix} = U^{-1}L^{-1}$$

Compute $T^{-1}$ for $N = 4$ (as shown) and for any $N$.


8 The matrix equation $TY = (0, 1, 0, 0)$ = delta vector is like the differential equation
$-y'' = \delta(x - a)$ with $a = 2\Delta x = \frac{2}{5}$. The boundary conditions are $y'(0) = 0$ and
$y(1) = 0$. Solve for $y(x)$ and graph it from 0 to 1. Also graph $Y$ = second column of
$T^{-1}$ at the points $x = \frac{1}{5}, \frac{2}{5}, \frac{3}{5}, \frac{4}{5}$. The two graphs are ramp functions.


9 The matrix $B$ has $B_{11} = 1$ (like $T_{11} = 1$) and also $B_{NN} = 1$ (where $T_{NN} = 2$). Why
does $B$ have the same pivots $1, 1, \ldots$ as $T$, except for zero in the last pivot position?
The early pivots don’t know $B_{NN} = 1$.

Then $B$ is not invertible: $-y'' = \delta(x - a)$ has no solution with $y'(0) = y'(1) = 0$.
10 When you compute $K^{-1}$, multiply by $\det K = N + 1$ to get nice numbers:

$$5K^{-1} = \begin{bmatrix} 4 & 3 & 2 & 1 \\ 3 & 6 & 4 & 2 \\ 2 & 4 & 6 & 3 \\ 1 & 2 & 3 & 4 \end{bmatrix} \qquad \text{[Figure: graph of column 2]}$$

Column 2 of $5K^{-1}$ solves the equation $Kv = 5\delta$ when the delta vector is $\delta = $ ____.
We know from $KK^{-1} = I$ that $K$ times each column of $K^{-1}$ is a delta vector.


11 $K$ comes with two boundary conditions. $T$ only has $y(1) = 0$. $B$ has no boundary
conditions on $y$. Verify that $K = A^T A$. Then remove the first row of $A$ to get
$T = A_1^T A_1$. Then remove the last row to get dependent rows: $B = A_0^T A_0$.

The backward first difference $A = \begin{bmatrix} 1 & & \\ -1 & 1 & \\ & -1 & 1 \\ & & -1 \end{bmatrix}$ gives $K = A^T A$.


12 Multiply $K_3$ by its eigenvector $y_n = (\sin n\pi h, \sin 2n\pi h, \sin 3n\pi h)$ to verify that
the eigenvalues $\lambda_1, \lambda_2, \lambda_3$ are $\lambda_n = 2 - 2\cos\frac{n\pi}{4}$ in $Ky_n = \lambda_n y_n$. This uses the
trigonometric identity $\sin(A + B) + \sin(A - B) = 2\sin A\cos B$.


13 Those eigenvalues of $K_3$ are $2 - \sqrt{2}$ and $2$ and $2 + \sqrt{2}$. Those add to 6, which is
the trace of $K_3$. Multiply those eigenvalues to get the determinant of $K_3$.


14 The slope of a ramp function is a step function. The slope of a step function is a delta
function. Suppose the ramp function is $r(x) = -x$ for $x \le 0$ and $r(x) = x$ for $x \ge 0$
(so $r(x) = |x|$). Find $dr/dx$ and $d^2r/dx^2$.


15 Find the second differences $y_{n+1} - 2y_n + y_{n-1}$ of these infinitely long vectors $y$:

Constant     $(\ldots, 1, 1, 1, 1, 1, \ldots)$
Linear       $(\ldots, -1, 0, 1, 2, 3, \ldots)$
Quadratic    $(\ldots, 1, 0, 1, 4, 9, \ldots)$
Cubic        $(\ldots, -1, 0, 1, 8, 27, \ldots)$
Ramp         $(\ldots, 0, 0, 0, 1, 2, \ldots)$
Exponential  $(\ldots, e^{-i\omega}, e^{0}, e^{i\omega}, e^{2i\omega}, \ldots)$.


It is amazing how closely those second differences follow second derivatives for
$y(x) = 1, x, x^2, x^3, \max(x, 0)$, and $e^{i\omega x}$. From $e^{i\omega x}$ we also get $\cos \omega x$ and $\sin \omega x$.
7.4 Laplace’s Equation and $A^T A$


Section 7.3 solved the differential equation $-d^2y/dx^2 = \delta(x - a)$. Boundary values
were given at $x = 0$ and $x = 1$ (our examples began with $y = 0$ at both endpoints).
The solutions $y(x)$ went linearly up from zero and linearly back to zero. These boundary
value problems correspond to a steady state, with no dependence on time.


Those are “1-dimensional Laplace equations”, certainly the simplest of their kind.
This section is more ambitious, in three important ways:


1 We will solve the 2-dimensional Laplace equation, our first PDE. The list of solutions
is infinite, and they are particularly beautiful. Amazingly the imaginary number
$i = \sqrt{-1}$ enters this real problem.

$$\text{Laplace’s partial differential equation} \qquad \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \tag{1}$$


2 The discrete form of (1) is a matrix equation for a vector $U$. That vector has
components $U_1, \ldots, U_n$ at the $n$ nodes of a graph. The graph could be a line in 1D
or a grid in 2D, or any network of nodes connected by $m$ edges (Figure 7.5).

[Figure 7.5 panels: line ($n = 4$, $m = 3$), grid ($n = 16$, $m = 24$), network ($n = 4$, $m = 6$)]

Figure 7.5: A 1D line graph, a 2D grid, and a complete graph: $n$ nodes and $m$ edges.


The natural discrete analog of Laplace’s equation (1) is a “5-point scheme” on a grid:

$$\frac{\Delta_x^2 U}{(\Delta x)^2} + \frac{\Delta_y^2 U}{(\Delta y)^2} = 0 \qquad \text{2nd difference across grid} + \text{2nd difference down grid} = 0 \tag{2}$$

For these equations we are given boundary values of $u$ and $U$. Instead of an interval
like $0 \le x \le 1$, there is a region in the plane: $u$ is given along its boundary. $U$ is
given at the 12 boundary points of the 4 by 4 grid. Equation (2) holds at each inside point.


3 The continuous and discrete Laplace equations are good examples of $A^T Au$.
$A^T A$ is symmetric with eigenvalues $\lambda \ge 0$. And one more matrix will produce $A^T CA$
in Section 7.5. In engineering, $C$ contains the physical properties of the material: stiffness
and conductivity and permeability. You will be seeing the structure of applied mathematics.
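Laplace’s Equation is $A^T Au = 0$


This is our first partial differential equation. It represents equilibrium, not change.

$$\text{Laplace’s equation for } u \qquad -\frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial y^2} = 0 \tag{3}$$

I have included minus signs to make the left side into $A^T Au$. In one dimension, $A$
was $d/dx$ and $A^T$ was $-d/dx$. Now we have two space variables $x$ and $y$, and two partial
derivatives $\partial/\partial x$ and $\partial/\partial y$ will go into $A$. Then $-\partial/\partial x$ and $-\partial/\partial y$ go into $A^T$.

The vector $Au$ has two components $\partial u/\partial x$ and $\partial u/\partial y$. This is the “gradient vector”.
We are into the 2D world of multivariable calculus and partial derivatives:

$$\text{Gradient of } u \qquad Au = \text{grad } u(x, y) = \begin{bmatrix} \partial/\partial x \\ \partial/\partial y \end{bmatrix} u = \begin{bmatrix} \partial u/\partial x \\ \partial u/\partial y \end{bmatrix} \tag{4}$$

I will skip double integrals and the Divergence Theorem (which is the 2D form of the
Fundamental Theorem of Calculus). Since $A$ is 2 by 1, you can guess that $A^T$ is 1 by 2:

$$\text{Divergence} \qquad A^T w = -\text{div } w = \begin{bmatrix} -\dfrac{\partial}{\partial x} & -\dfrac{\partial}{\partial y} \end{bmatrix}\begin{bmatrix} w_1(x, y) \\ w_2(x, y) \end{bmatrix} = -\frac{\partial w_1}{\partial x} - \frac{\partial w_2}{\partial y} \tag{5}$$

Then $A^T Au$ is (minus) the divergence of the gradient of $u(x, y)$. This is the Laplacian:

$$A^T Au = -\text{div grad } u \qquad A^T Au = -\frac{\partial}{\partial x}\left(\frac{\partial u}{\partial x}\right) - \frac{\partial}{\partial y}\left(\frac{\partial u}{\partial y}\right) = -\frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial y^2} \tag{6}$$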


You recognize $A^T Au = 0$ as Laplace’s equation. With zero on the right hand side, the
minus sign can be included or not. We usually give Poisson’s name when the equation has a
nonzero source (or a sink) $f(x, y)$ on the right hand side:


$u_{xx} + u_{yy} = f(x, y)$ is Poisson’s equation.


The subscripts in $u_{xx}$ and $u_{yy}$ indicate second partial derivatives: $u_{xx} = \partial^2 u/\partial x^2$ and
$u_{yy} = \partial^2 u/\partial y^2$. In this notation, $u_t$ indicates $\partial u/\partial t$. Previously that was $u'$, in the
ordinary differential equations of earlier chapters. PDEs bring these new notations.


Example 1  $u = xy$ solves Laplace’s equation $u_{xx} + u_{yy} = 0$. And $u_p = x^2 + y^2$
solves Poisson’s equation $u_{xx} + u_{yy} = 4$ with a constant source. The complete solution
for Poisson is this particular solution $x^2 + y^2$ plus any null solution for Laplace.
Solutions to Laplace’s Equation


We want a complete set of solutions to $u_{xx} + u_{yy} = 0$. The list will be infinitely long.
Combinations of those solutions will also be solutions. Laplace’s equation is linear, so
superposition is allowed. Four solutions are easy to find: $u = 1, x, y, xy$.
For those four, $u_{xx}$ and $u_{yy}$ are both zero. To find further solutions, we need $u_{xx}$ to
cancel $u_{yy}$.


Start with $u = x^2$, which has $u_{xx} = 2$. Then $u_{yy} = -2$ is achieved by $-y^2$.
The combination $u = x^2 - y^2$ solves Laplace’s equation. This solution has “degree 2”
because if $x$ and $y$ are multiplied by $C$, then $u$ is multiplied by $C^2$. The same was true
of $u = xy$, also degree 2 because $(Cx)(Cy)$ is $C^2$ times $xy$.


The real question starts with $x^3$. Can this be completed to a solution of degree 3?
From $u = x^3$ we will have $u_{xx} = 6x$. To cancel $6x$, we need a piece that has $u_{yy} = -6x$.
That piece is $-3xy^2$. The combination $u = x^3 - 3xy^2$ has degree 3 and goes into our list.


The hope is to find two solutions of every degree. Here is the list so far. I will write each
pair of solutions in polar coordinates too, starting with $u = x = r\cos\theta$.

degree 1    $x$    $y$    $r\cos\theta$    $r\sin\theta$
degree 2    $x^2 - y^2$    $2xy$    $r^2\cos 2\theta$    $r^2\sin 2\theta$
degree 3    $x^3 - 3xy^2$    $3x^2y - y^3$    $r^3\cos 3\theta$    $r^3\sin 3\theta$

On the polar coordinate list, the pattern is clear. The pairs of solutions to Laplace’s equation
are $r^n\cos n\theta$ and $r^n\sin n\theta$. Those will be solutions also for $n = 4, 5, \ldots$

The first list (pairs of $x, y$ polynomials) also has a remarkable pattern. Those are the
real and imaginary parts of $(x + iy)^n$. Degree $n = 2$ shows the two parts clearly:

$(x + iy)^2$ is $x^2 - y^2 + i\,2xy$    This is $(re^{i\theta})^2 = r^2e^{2i\theta} = r^2\cos 2\theta + i\,r^2\sin 2\theta$.

The polar pair $r^n\cos n\theta$ and $r^n\sin n\theta$ satisfy Laplace’s equation for every $n$. The $x$-$y$ pair
succeeds because $u_{yy}$ includes $i^2 = -1$, to cancel $u_{xx}$. We have two solutions for each $n$:

$$\text{Degree } n \qquad u_n = \text{Re}\,(x + iy)^n = r^n\cos n\theta \qquad s_n = \text{Im}\,(x + iy)^n = r^n\sin n\theta. \tag{7}$$
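
Formula (7) can be tried out with complex arithmetic. The following is a small Python sketch: the sample point $(0.6, 0.3)$, the degree $n = 3$ and the helper name `harmonic_pair` are illustrative assumptions. It compares the real and imaginary parts of $(x + iy)^n$ with the polar forms.

```python
import cmath
import math

def harmonic_pair(x, y, n):
    """Return (u_n, s_n) = (Re, Im) of (x + iy)^n."""
    z = complex(x, y) ** n
    return z.real, z.imag

x, y, n = 0.6, 0.3, 3                  # illustrative point and degree
u, s = harmonic_pair(x, y, n)
r, theta = cmath.polar(complex(x, y))
print(u, r ** n * math.cos(n * theta))     # both are x^3 - 3xy^2
print(s, r ** n * math.sin(n * theta))     # both are 3x^2 y - y^3
```

For $n = 3$ the two parts reproduce the degree 3 row of the list, $x^3 - 3xy^2$ and $3x^2y - y^3$.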


All combinations of these solutions will also solve Laplace’s equation. For ordinary
differential equations (second order with $y''$), we had two solutions. All null solutions were
combinations $c_1y_1 + c_2y_2$. By choosing $c_1$ and $c_2$ we matched the two initial
conditions $y(0)$ and $y'(0)$. Now we have a partial differential equation with an infinite
list of solutions, two of each degree.
By choosing the right coefficients $a_n$ and $b_n$ for every $n$, including the constant $a_0$,
we can match any function $u = u_0(x, y)$ around the boundary:

On the boundary    $u_0(x, y) = a_0 + a_1x + b_1y + a_2(x^2 - y^2) + b_2(2xy) + \cdots$
Circular boundary    $u_0(1, \theta) = a_0 + a_1\cos\theta + b_1\sin\theta + a_2\cos 2\theta + b_2\sin 2\theta + \cdots$

That last sum is a Fourier series. It enters when we solve Laplace’s equation inside a circle.
The boundary condition $u = u_0$ is given on the circle $r = 1$. For 1D problems the boundary
was the two endpoints $x = 0$ and $x = 1$. We only needed two solutions.

The right choice of all the Fourier coefficients $a_n$ and $b_n$ will come in Chapter 8,
and it completes the solution to Laplace’s equation inside a circle:

$$\text{Solution to } u_{xx} + u_{yy} = 0 \qquad u = a_0 + \sum_{n=1}^{\infty} \left(a_n r^n\cos n\theta + b_n r^n\sin n\theta\right). \tag{8}$$


Finite Differences and Finite Elements 


Laplace’s equation is often made discrete. The derivatives $u_{xx}$ and $u_{yy}$ are replaced
by finite differences. That produces a large matrix K2D, which is a two-dimensional
analog of the tridiagonal $-1, 2, -1$ matrix K. For the square grid in Figure 7.5, there will
be entries $-1, 2, -1$ in the x-direction and also in the y-direction. K2D has five entries
on a typical inside row: 2 + 2 = 4 on its main diagonal and four entries of $-1$.

Suppose the region is not square but curved (like a circle). Then finite differences 
get complicated. The nodes of a square grid don’t fall on circles. The favorite approach 
changes to the finite element method, which can divide the region into triangles of 
arbitrary shapes. (A triangle can even have a curved edge to fit a boundary.) These 
finite elements are described in my textbook Computational Science and Engineering, 
with codes that use linear functions a + bx + cy inside each triangle of the mesh. 
The accuracy is studied in An Analysis of the Finite Element Method. 


Laplace’s Difference Matrix K2D


The approach that fits with this book is finite differences. I want to construct the symmetric
matrix K2D with rows like $-1, -1, 4, -1, -1$ and show that it is positive definite. K2D
comes from second differences in the x and y directions. Each meshpoint needs two indices
i and j, to specify its row number and column number on the grid. Go across and up-down:


$-\dfrac{\partial^2 u}{\partial x^2} \approx \dfrac{-U_{i+1,j} + 2U_{i,j} - U_{i-1,j}}{(\Delta x)^2}$          $-\dfrac{\partial^2 u}{\partial y^2} \approx \dfrac{-U_{i,j+1} + 2U_{i,j} - U_{i,j-1}}{(\Delta y)^2}$


The square grid has $\Delta x = \Delta y$. Combine $2U_{i,j}$ with $2U_{i,j}$. Then 4 goes on the diagonal of
K2D. The difference equation says that each $U_{i,j}$ is the average of its 4 neighbors:


$\Delta_x^2 U + \Delta_y^2 U = 0$          $4U_{i,j} - U_{i+1,j} - U_{i-1,j} - U_{i,j+1} - U_{i,j-1} = 0$.     (9)

If a neighbor of the i, j node falls on the boundary of the square grid, that boundary
value of U will be known. Then that term moves to the right side of the difference equation.
An entry of $-1$ disappears from K2D on boundary rows.

If we number the nodes a row at a time, the $u_{xx}$ term puts the 1D matrix K in each
block row. The $u_{yy}$ term connects three rows with $-I$ and $2I$ and $-I$.


$$K2D = \begin{bmatrix} K & & \\ & K & \\ & & K \end{bmatrix} + \begin{bmatrix} 2I & -I & \\ -I & 2I & -I \\ & -I & 2I \end{bmatrix} = \mathrm{kron}(I, K) + \mathrm{kron}(K, I).$$


With N interior points in each row, this block matrix K2D is $N^2$ by $N^2$. MATLAB’s
command kron(A, B) replaces each $A_{ij}$ by the block $A_{ij}B$, so the size grows to $N^2$.
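
The construction kron(I, K) + kron(K, I) can be sketched without MATLAB. This is a minimal pure-Python version, assuming plain nested lists for matrices; the helper names are illustrative and are not from the text.

```python
def second_difference(N):
    """N by N tridiagonal matrix K: 2 on the diagonal, -1 beside it."""
    return [[2 if i == j else -1 if abs(i - j) == 1 else 0
             for j in range(N)] for i in range(N)]


def identity(N):
    return [[1 if i == j else 0 for j in range(N)] for i in range(N)]


def kron(A, B):
    """Kronecker product: each entry A[i][j] becomes the block A[i][j] * B."""
    return [[A[i][j] * B[p][q]
             for j in range(len(A[0])) for q in range(len(B[0]))]
            for i in range(len(A)) for p in range(len(B))]


def laplace_2d(N):
    """K2D = kron(I, K) + kron(K, I) for N interior points in each row."""
    K, I = second_difference(N), identity(N)
    first, second = kron(I, K), kron(K, I)
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(first, second)]


K2D = laplace_2d(2)
for row in K2D:
    print(row)
```

For N = 2 every row has 4 on the diagonal and two entries of $-1$, since each interior node touches two boundary nodes. For N = 3 the middle row of the 9 by 9 matrix shows the full pattern with four entries of $-1$.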

Here is the matrix for a grid with 3 × 3 = 9 squares and 4 × 4 = 16 nodes. There are
2 × 2 = 4 interior nodes. The other 16 − 4 = 12 nodes are around the square boundary,
where U is given by the boundary condition $u = u_0$. For a large grid, $N^2$ interior points will
far outnumber $4N + 4$ boundary points.


Laplace difference matrix
The interior mesh is 2 by 2          $K2D = \begin{bmatrix} 4 & -1 & -1 & 0 \\ -1 & 4 & 0 & -1 \\ -1 & 0 & 4 & -1 \\ 0 & -1 & -1 & 4 \end{bmatrix}$


Those rows lost two $-1$’s because each interior gridpoint is next to two boundary points.
Normally we see four $-1$’s in almost every row of K2D.
Here is the solution to K2D U = 0 in the square when boundary values are 0 and 4:


[Figure: values of U on the 4 by 4 grid.] Each bold value of U is the average of 4 neighbors.

The eigenvalues of this matrix K2D are $\lambda = 2, 4, 4, 6$. They add to 16, which is the trace:
the sum down the diagonal of K2D above. The eigenvectors are orthogonal:


Eigenvectors of K2D     (1, 1, 1, 1) and (1, 1, −1, −1), (1, −1, 1, −1) and (1, −1, −1, 1).


Symmetry of K2D guaranteed orthogonal eigenvectors. Positive definiteness produced
those positive eigenvalues 2, 4, 4, 6.


Eigenvalues of the Laplacian: Continuous and Discrete


In one dimension, the eigenfunctions for $-u_{xx} = \lambda u$ are $u = \sin n\pi x$ with eigenvalue
$\lambda = n^2\pi^2$. These sine functions are zero at the endpoints x = 0 and x = 1. On a unit
square in two dimensions, the eigenfunctions of the Laplacian are just products of sines:
$u(x, y) = (\sin n\pi x)(\sin m\pi y)$ with eigenvalue $\lambda = n^2\pi^2 + m^2\pi^2$. Those functions
are zero on the whole boundary of the square, where x = 0 or x = 1 or y = 0 or y = 1:


$-\left(\dfrac{\partial^2}{\partial x^2} + \dfrac{\partial^2}{\partial y^2}\right)(\sin n\pi x)(\sin m\pi y) = (n^2\pi^2 + m^2\pi^2)(\sin n\pi x)(\sin m\pi y)$.     (10)


The problem on a square allows separation of variables. Each of the eigenvectors is a
(function of x) times a (function of y). Two 1D problems, just what we hope for.
Equation (6) expressed $-u_{xx} - u_{yy}$ as $-\mathrm{div}(\mathrm{grad}\,u)$. This is $A^TA$ ($A$ = gradient).
The test $\lambda > 0$ is passed on non-square regions too, when the x, y variables don’t separate.
Slope conditions (a derivative of u is zero instead of the function itself) allow the
constant eigenfunction u = 1. Then $\lambda = 0$ and the Laplacian becomes semidefinite.


Turn now to the matrix Laplacian K2D. In one dimension, the eigenvectors of K are
discrete sine vectors: Sample the continuous eigenfunction $\sin n\pi x$ at N equally spaced
points. The spacing is $\Delta x = 1/(N + 1)$ inside the interval from 0 to 1. The eigenvalues
of K are $\lambda_n = 2 - 2\cos(n\pi\Delta x)$. We may hope and expect that the eigenvectors of K2D
will contain products of sines, and the eigenvalues will be sums of 1D eigenvalues $\lambda(K)$.


The $N^2$ eigenvalues of K2D are positive. The x and y directions still separate.


$$\lambda_{nm}(K2D) = \lambda_n(K) + \lambda_m(K) = 4 - 2\cos\frac{n\pi}{N+1} - 2\cos\frac{m\pi}{N+1}$$


Thus K2D for a square is symmetric positive definite. This formula for the eigenvalues
recovers $\lambda = 2, 4, 4, 6$ when N = 2, because the cosines of $\pi/3$ and $2\pi/3$ are $\tfrac12$ and $-\tfrac12$.
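
The eigenvalue formula and the sine-product eigenvectors can be tested directly. The sketch below is an illustration under stated assumptions: it applies the five-point difference equation to an N by N grid with zero boundary values instead of storing K2D, and all function names are invented for this example.

```python
import math


def apply_k2d(U, N):
    """Apply K2D to an N by N grid U, taking zero values outside the grid."""
    def val(i, j):
        return U[i][j] if 0 <= i < N and 0 <= j < N else 0.0
    return [[4 * U[i][j] - val(i + 1, j) - val(i - 1, j)
             - val(i, j + 1) - val(i, j - 1)
             for j in range(N)] for i in range(N)]


def eigenvalue(n, m, N):
    """4 - 2 cos(n pi/(N+1)) - 2 cos(m pi/(N+1))."""
    h = math.pi / (N + 1)
    return 4 - 2 * math.cos(n * h) - 2 * math.cos(m * h)


def sine_product(n, m, N):
    """Samples of (sin n pi x)(sin m pi y) at the interior grid points."""
    h = math.pi / (N + 1)
    return [[math.sin(n * (i + 1) * h) * math.sin(m * (j + 1) * h)
             for j in range(N)] for i in range(N)]


def eigen_residual(n, m, N):
    """Largest entry of (K2D)U - lambda U for the sine-product vector U."""
    U = sine_product(n, m, N)
    KU = apply_k2d(U, N)
    lam = eigenvalue(n, m, N)
    return max(abs(KU[i][j] - lam * U[i][j])
               for i in range(N) for j in range(N))


N = 2
eigs = sorted(eigenvalue(n, m, N)
              for n in range(1, N + 1) for m in range(1, N + 1))
print(eigs)                      # close to 2, 4, 4, 6
print(eigen_residual(2, 3, 5))   # near zero on a 5 by 5 interior grid
```

The same check works for any N, since every pair n, m between 1 and N gives one eigenvector, for $N^2$ in total.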


■ REVIEW OF THE KEY IDEAS ■

1. Laplace’s equation is solved by the real and the imaginary part of every $(x + iy)^n$.
2. Those are $u = r^n\cos n\theta$ and $s = r^n\sin n\theta$. Their combinations are Fourier series.
3. The discrete equation is $\Delta_x^2U + \Delta_y^2U = 0$. The matrix K2D is positive definite.
4. Eigenvectors are (sines in x)(sines in y): $-u_{xx} - u_{yy} = \lambda u$ and $(K2D)U = \lambda U$.


Problem Set 7.4


1   What solution to Laplace’s equation completes “degree 3” in the table of pairs of
    solutions? We have one solution $u = x^3 - 3xy^2$, and we need another solution.


2   What are the two solutions of degree 4, the real and imaginary parts of $(x + iy)^4$?
    Check $u_{xx} + u_{yy} = 0$ for both solutions.


3   What is the second x-derivative of $(x + iy)^n$? What is the second y-derivative?
    Those cancel in $u_{xx} + u_{yy}$ because $i^2 = -1$.


4   For the solved 2 × 2 example inside a 4 × 4 square grid, write the four equations (9)
    at the four interior nodes. Move the known boundary values 0 and 4 to the right hand
    sides of the equations. You should see K2D on the left side multiplying the correct
    solution $U = (U_{11}, U_{12}, U_{21}, U_{22}) = (1, 2, 2, 3)$.


5   Suppose the boundary values on the 4 × 4 grid change to U = 0 on three sides and
    U = 8 on the fourth side. Find the four inside values so that each one is the average
    of its neighbors.


6   (MATLAB) Find the inverse $(K2D)^{-1}$ of the 4 by 4 matrix K2D displayed for the
    square grid.


7   Solve this Poisson finite difference equation (right side ≠ 0) for the inside values
    $U_{11}, U_{12}, U_{21}, U_{22}$. All boundary values like $U_{10}$ and $U_{13}$ are zero. The boundary
    has i or j equal to 0 or 3, the interior has i and j equal to 1 or 2:


    $4U_{ij} - U_{i-1,j} - U_{i+1,j} - U_{i,j-1} - U_{i,j+1} = 1$ at four inside points.


8   A 5 × 5 grid has a 3 by 3 interior grid: 9 unknown values $U_{11}$ to $U_{33}$. Create the
    9 × 9 difference matrix K2D.


9   Use eig(K2D) to find the nine eigenvalues of K2D in Problem 8. Those eigenvalues
    will be positive! The matrix K2D is symmetric positive definite.


10  If u(x) solves $u_{xx} = 0$ and v(y) solves $v_{yy} = 0$, verify that u(x)v(y) solves
    Laplace’s equation. Why is this only a 4-dimensional space of solutions? Separation
    of variables does not give all solutions, only the solutions with separable boundary
    conditions.


7.5 Networks and the Graph Laplacian


Start with a graph that has n nodes and m edges. Its m by n incidence matrix A was
introduced in Section 5.6, with a row in the matrix for every edge in the graph.
A single $-1$ and 1 in the row indicates which two nodes are connected by that edge.
Now we take the step to $L = A^TA$ and $K = A^TCA$. These are symmetric positive
semidefinite matrices that describe the whole network.

Those matrices L and K are the graph Laplacians. L is unweighted (with C = I)
and K is weighted by C. These are the fundamental matrices for flows in the networks.
They describe electrical networks and their applications go very much further. You see
$A^TA$ and $A^TCA$ in descriptions of the brain and the Internet and our nervous system and
the power grid.

Social networks and political networks and intellectual networks also use L and K.
Graphs have simply become the most important model in discrete applied mathematics.

This is not a standard topic in teaching linear algebra. But it is today an essential topic in
applying linear algebra. It belongs in this book.


Examples of $A$ and $A^TA$


We quickly review incidence matrices, by constructing A for the planar graph and the line
graph in Figure 7.6. You will see that every row of A adds to $-1 + 1 = 0$. Then the all-ones
vector v = (1, ..., 1) leads to Av = 0. The columns of A are dependent, because their
sum is the zero column. Av = 0 propagates to $A^TAv = 0$ and $A^TCAv = 0$, so $A^TCA$
for this A will be positive semidefinite (but not invertible and not positive definite).


Incidence matrix     $A_{\text{line}} = \begin{bmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{bmatrix}$


Figure 7.6: A planar graph and a line graph: n = 4 nodes and m = 5 or 3 edges.


$A_{\text{line}}$ is a 3 by 4 difference matrix. Then $A^TA$ below contains second differences.
Notice that the first and last entries of $A^TA$ are 1 and not 2. The diagonal 1, 2, 2, 1
counts the number of edges that meet at each node (the “degrees” of the four nodes).


$Av$ = differences of v’s     $Av = \begin{bmatrix} v_2 - v_1 \\ v_3 - v_2 \\ v_4 - v_3 \end{bmatrix}$

$A^TA$ = line Laplacian     $A^TA = \begin{bmatrix} 1 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 1 \end{bmatrix}$     (1)

For the planar graph, the incidence matrix A again computes differences $v_{\text{end}} - v_{\text{start}}$
on every edge. The Laplacian matrix $L = A^TA$ again has rows adding to zero. The
diagonal of L shows 3, 3, 2, 2 edges into the four nodes. Everything in A and L can be
copied directly from the graph! The missing pair of $-1$ entries in $L = A^TA$ is because
no edge connects nodes 3 and 4 on the 5-edge graph.


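Incidence matrix     $A = \begin{bmatrix} -1 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & -1 & 1 & 0 \\ -1 & 0 & 0 & 1 \\ 0 & -1 & 0 & 1 \end{bmatrix}$

Laplacian matrix     $A^TA = \begin{bmatrix} 3 & -1 & -1 & -1 \\ -1 & 3 & -1 & -1 \\ -1 & -1 & 2 & 0 \\ -1 & -1 & 0 & 2 \end{bmatrix}$     (2)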

Note  If any arrows change direction on the edges of the graph, this changes A. But
$A^TA$ does not change. The direction of arrows just multiplies A by a diagonal sign
matrix S (entries $\pm 1$). Then $(SA)^T(SA)$ is the same as $A^TA$ because $S^TS = I$.

The eigenvalues of $L = A^TA$ always include $\lambda = 0$, from the all-ones eigenvector.
The energy $v^T(A^TA)v$ can also be written as $(Av)^T(Av)$. This just adds up the squares of
all the entries of Av, which are differences across edges (not the missing edge from 3 to 4):


Energy $= (v_2 - v_1)^2 + (v_3 - v_1)^2 + (v_3 - v_2)^2 + (v_4 - v_1)^2 + (v_4 - v_2)^2$.


We see again that the all-ones vector v = (1, 1, 1, 1) has zero energy.
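
Here is a short Python sketch of these steps for the 5-edge planar graph. The edge list is an assumption read off from the five differences in the energy formula (edge k runs from its start node to its end node), and the function names are illustrative.

```python
# Edges as (start, end) node pairs, numbered from 1, matching the energy terms.
EDGES = [(1, 2), (1, 3), (2, 3), (1, 4), (2, 4)]


def incidence(edges, n):
    """m by n incidence matrix: -1 at the start node, +1 at the end node."""
    A = []
    for start, end in edges:
        row = [0] * n
        row[start - 1] = -1
        row[end - 1] = 1
        A.append(row)
    return A


def laplacian(A):
    """L = A^T A computed entry by entry."""
    n = len(A[0])
    return [[sum(row[i] * row[j] for row in A) for j in range(n)]
            for i in range(n)]


def energy(A, v):
    """(Av)^T (Av): the sum of squared differences across the edges."""
    return sum(sum(a * x for a, x in zip(row, v)) ** 2 for row in A)


A = incidence(EDGES, 4)
L = laplacian(A)
for row in L:
    print(row)                     # diagonal 3, 3, 2, 2 and zero row sums
print(energy(A, [1, 1, 1, 1]))     # the all-ones vector has zero energy
```

Reversing the arrow on any edge flips the sign of one row of A and leaves L unchanged, which is the point of the Note about the sign matrix S.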

The Laplacian matrix $L = A^TA$ is not invertible! A system of equations $A^TAv = f$
has no solution (or infinitely many). To reach an invertible matrix, we remove the last
column and row of $A^TA$. This corresponds to “grounding a node” by setting the voltage at
that node to be zero: $v_4 = 0$. It is like fixing one temperature at zero, when the equations
only tell us about differences of temperature.

When we know that $v_4 = 0$, column 4 is removed from A. That removes column 4
and also row 4 from $A^TA$. This reduced 3 by 3 matrix is positive definite:


$(A^TA)_{\text{reduced}} = \begin{bmatrix} 3 & -1 & -1 \\ -1 & 3 & -1 \\ -1 & -1 & 2 \end{bmatrix} = (A_{\text{reduced}})^T(A_{\text{reduced}})$ = (3 by 5)(5 by 3).     (3)

The Weighted Laplacian $K = A^TCA$

In many applications the edges come with positive weights $c_1, \ldots, c_m$. Those weights can
be conductances (through m resistors) or stiffnesses (of m springs). In electrical
engineering, Ohm’s Law connects current w to voltage difference e. In mechanical
engineering, Hooke’s Law connects spring force w to the stretching e. Those laws w = ce
in every edge give a positive diagonal matrix C in w = Ce = CAv. The m currents in
w come from the m voltage differences in Av.

Kirchhoff’s Current Law is $A^Tw = 0$. That matrix $A^T$ always enters the “balance
of currents” and the “balance of forces” between springs. With current sources, or forces
applied from outside, the balance equation is $A^Tw = f$.

When current sources enter the nodes, the Current Law $A^Tw = f$ is “in equals out.”
Then $A^TCe = f$ and $A^TCAv = f$. Thus $K = A^TCA$ is the conductance matrix for
the whole network. Here is $A^TCA$ for the line of resistors:


$A^Tw = f$ (Kirchhoff)
$A^TCe = f$ (Ohm)               $(A^TCA)_{\text{line}} = \begin{bmatrix} c_1 & -c_1 & 0 & 0 \\ -c_1 & c_1 + c_2 & -c_2 & 0 \\ 0 & -c_2 & c_2 + c_3 & -c_3 \\ 0 & 0 & -c_3 & c_3 \end{bmatrix}$
$A^TCAv = f$ (System)


The rows of $A^TCA$ still add to zero. The matrix is still positive semidefinite. It becomes
positive definite when row and column 4 are removed, which we must do to solve
$A^TCAv = f$. This is a fundamental equation of discrete applied mathematics.

A network can also have voltage sources (like batteries) on the edges. Those go into a
vector b with m components. From node to node the voltage drops are $-Av$ (with a minus
sign). But Ohm’s Law applies to the voltage drops e across the resistors. By working with
the matrix C and including b in the vector $e = b - Av$, Ohm’s Law is simply w = Ce.
The inputs to the network are f and b.

The three equations for e, w, f use the matrices $A$, $C$, $A^T$. Those become two
equations by eliminating $e = C^{-1}w$. We reach one equation by also eliminating w.


               3 equations          2 equations               1 equation

Voltage drop   $e = b - Av$         $C^{-1}w = b - Av$
Current        $w = Ce$             $f = A^Tw$                $A^TCAv = A^TCb - f$
Balance        $f = A^Tw$


I removed e by substituting $e = C^{-1}w$ into the first equation. The step from two equations
to one equation substituted $w = C(b - Av)$ into $f = A^Tw$. Almost all entries of A
and C will be zero. The weighted graph Laplacian is $K = A^TCA$.

You see how the sources b and f produce the right side. They make the currents flow. 


A Framework for Applied Mathematics 


The least squares equation $A^TAv = A^Tb$ and the weighted least squares equation
$A^TCAv = A^TCb$ are special cases with f = 0. My experience is that all the
symmetric steady state problems of applied mathematics fit into this $A^TCA$ framework.


Voltage Law → $A$          Ohm’s Law → $C$          Current Law → $A^T$


I have learned to watch for $A^TCA$ in every lecture about applied mathematics: it is
there. Differential equations fit this framework too. Laplace’s equation is $A^TAu = 0$ when
Au is the gradient of u(x, y). A typical $A^TCA$ equation is $-d/dx\,(c\,du/dx) = f(x)$.

For matrices, those derivatives become differences. The graph analogy with Laplace’s
equation gave the name graph Laplacian to the matrix $A^TA$.

Dynamic problems have time derivatives du/dt. This adds a new step to the $A^TCA$
framework. The equation $du/dt = -A^TAu$ is a matrix analog of the heat equation
$\partial u/\partial t = \partial^2u/\partial x^2$. The next chapter will solve the heat equation using the eigenvalues
and eigenfunctions (sines and cosines) from $y'' = \lambda y$. The solutions are Fourier series.


Example: A Network of Resistors 


I will add resistors to the five edges of our four-node graph. The conductances
1/R will be the numbers $c_1$ to $c_5$. The conductance matrix for the whole network is $A^TCA$.
The incidence matrix A in equation (2) above is 5 by 4, and $A^TCA$ is 4 by 4.


Conductance matrix K
with five edges          $A^TCA = \begin{bmatrix} c_1 + c_2 + c_4 & -c_1 & -c_2 & -c_4 \\ -c_1 & c_1 + c_3 + c_5 & -c_3 & -c_5 \\ -c_2 & -c_3 & c_2 + c_3 & 0 \\ -c_4 & -c_5 & 0 & c_4 + c_5 \end{bmatrix}$     (5)

Please compare this matrix to $A^TA$ in equation (2), where all $c_i = 1$. The new matrix
starts with $c_1 + c_2 + c_4$ because edges 1, 2, 4 touch node 1. Along that row of K, the entries
$-c_1, -c_2, -c_4$ produce row sum = zero as we expect. Then $A^TCA$ is singular, not invertible.
We must reduce the matrix to 3 by 3 by “grounding a node” and removing column 4
and row 4. The reduced matrix is symmetric positive definite.


Suppose the voltage $v_1 = V$ is fixed, as well as $v_4 = 0$ at the grounded node. Current
will flow out of node 1 toward node 4 (with b = f = 0). The terms $c_1V$ and $c_2V$
involving the known $v_1 = V$ move to the right hand side of $A^TCAv = 0$. There are
only two unknown voltages $v_2$ and $v_3$, and V is like a boundary value:


Reduced equations
$v_1 = V$ and $v_4 = 0$          $\begin{bmatrix} c_1 + c_3 + c_5 & -c_3 \\ -c_3 & c_2 + c_3 \end{bmatrix} \begin{bmatrix} v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} c_1V \\ c_2V \end{bmatrix}$     (6)


When we solve for $v_2$ and $v_3$, we know all four voltages v and all five currents w = CAv.
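
The reduced equations (6) are small enough to solve by hand or by a few lines of code. This Python sketch uses exact fractions and the conductance values 1, 2, 3, 4, 5 with V = 1; those numbers are made up for illustration and do not come from the text. The currents are taken as flow from the start node to the end node of each edge, which is $w = C(b - Av)$ with b = 0.

```python
from fractions import Fraction

# Same five edges as the planar graph: (start, end), nodes numbered from 1.
EDGES = [(1, 2), (1, 3), (2, 3), (1, 4), (2, 4)]


def solve_reduced(c, V):
    """Solve the 2 by 2 system (6) for v2, v3 by Cramer's rule."""
    c1, c2, c3, c4, c5 = (Fraction(x) for x in c)
    V = Fraction(V)
    a11, a12, a22 = c1 + c3 + c5, -c3, c2 + c3
    r1, r2 = c1 * V, c2 * V
    det = a11 * a22 - a12 * a12
    v2 = (r1 * a22 - a12 * r2) / det
    v3 = (a11 * r2 - a12 * r1) / det
    return v2, v3


def edge_currents(c, v):
    """Current on each edge from start to end: c * (v_start - v_end)."""
    return [Fraction(ck) * (v[s - 1] - v[e - 1])
            for ck, (s, e) in zip(c, EDGES)]


c = (1, 2, 3, 4, 5)   # hypothetical conductances c1..c5
V = 1                 # hypothetical fixed voltage at node 1
v2, v3 = solve_reduced(c, V)
v = [Fraction(V), v2, v3, Fraction(0)]
w = edge_currents(c, v)

out_of_node_1 = w[0] + w[1] + w[3]   # edges 1, 2, 4 leave node 1
into_node_4 = w[3] + w[4]            # edges 4, 5 end at node 4
print(v2, v3)
print(out_of_node_1, into_node_4)    # the two totals agree
```

The agreement of the two totals is Kirchhoff’s Current Law: with no sources at nodes 2 and 3, everything that leaves node 1 arrives at the grounded node.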

Summary


The matrix C changes an “ideal” $A^TA$ problem into an “applied” $A^TCA$ problem. You
will see how this three-step framework appears all through applied mathematics.
Au is often a derivative of u, or a finite difference. Then CAu comes from Ohm’s Law
or Hooke’s Law. The material constants like conductance and stiffness go into C.


Finally $A^TCAv = f$ is a continuity equation or a balance equation. It represents
balance of forces, balance of inputs with outputs, balance of profits with losses. The
combined matrix $K = A^TCA$ is symmetric positive definite just like $A^TA$.


To find the forces or the flows inside the network, we solve for v and e and w.


The Adjacency Matrix


The Laplacian matrices $L = A^TA$ and $K = A^TCA$ started with the incidence matrix A.
The diagonal of L has the degree of each node: the number of edges that touch the node.
$A^TA$ also comes directly from the degree matrix D minus the adjacency matrix W:


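$A^TA = \begin{bmatrix} 3 & -1 & -1 & -1 \\ -1 & 3 & -1 & -1 \\ -1 & -1 & 2 & 0 \\ -1 & -1 & 0 & 2 \end{bmatrix} = \begin{bmatrix} 3 & & & \\ & 3 & & \\ & & 2 & \\ & & & 2 \end{bmatrix} - \begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \end{bmatrix} = D - W$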


The degrees 3, 3, 2, 2 in D are the row sums in W. Then D − W has zero row sums.
When $L = A^TA = D - W$ multiplies (1, 1, 1, 1) the result will be (0, 0, 0, 0).


Question  The sum of the degrees is 10. How can this be predicted from the graph?


Answer  The graph has five edges. Each edge produces two 1’s in the adjacency matrix.
There must be ten 1’s in W. The degrees in D must add to 10, to balance the 1’s in W.


Since the trace of L is 3 + 3 + 2 + 2, the eigenvalues of L must also add to 10.


Question  What is the rule for W and D when there are weights $c_1, \ldots, c_m$ on the edges?


Answer  Each entry $W_{ij} = 1$ comes from an edge between node i and node j. When
this edge k has a weight $c_k$ (the conductance along the edge), the entry $W_{ij}$ changes
from 1 to $c_k$. The weights produce $A^TCA$ in equation (5) and also in equation (8).


$A^TCA = K = D - W$ with weights:

$K = \begin{bmatrix} c_1 + c_2 + c_4 & & & \\ & c_1 + c_3 + c_5 & & \\ & & c_2 + c_3 & \\ & & & c_4 + c_5 \end{bmatrix} - \begin{bmatrix} 0 & c_1 & c_2 & c_4 \\ c_1 & 0 & c_3 & c_5 \\ c_2 & c_3 & 0 & 0 \\ c_4 & c_5 & 0 & 0 \end{bmatrix}$     (8)
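
The two routes to the weighted Laplacian can be compared in code. This is a sketch with assumed helper names and made-up integer weights: one function sums $c_k\,(\text{row } k)^T(\text{row } k)$ over the edges, which is $A^TCA$, and the other builds D − W from the weighted adjacency matrix.

```python
# The five edges of the planar graph as (start, end), nodes numbered from 1.
EDGES = [(1, 2), (1, 3), (2, 3), (1, 4), (2, 4)]


def weighted_laplacian_from_incidence(edges, weights, n):
    """K = A^T C A, added up edge by edge as c * (row)^T (row)."""
    K = [[0] * n for _ in range(n)]
    for (s, e), c in zip(edges, weights):
        row = [0] * n
        row[s - 1], row[e - 1] = -1, 1
        for i in range(n):
            for j in range(n):
                K[i][j] += c * row[i] * row[j]
    return K


def degree_minus_adjacency(edges, weights, n):
    """K = D - W, where W holds the weights and D holds the row sums of W."""
    W = [[0] * n for _ in range(n)]
    for (s, e), c in zip(edges, weights):
        W[s - 1][e - 1] = c
        W[e - 1][s - 1] = c
    D = [sum(row) for row in W]
    return [[(D[i] if i == j else 0) - W[i][j] for j in range(n)]
            for i in range(n)]


weights = [1, 2, 3, 4, 5]   # hypothetical conductances c1..c5
K1 = weighted_laplacian_from_incidence(EDGES, weights, 4)
K2 = degree_minus_adjacency(EDGES, weights, 4)
print(K1 == K2)             # both constructions give the same matrix
```

The edge-by-edge sum in the first function is the same assembly idea as the element matrices in Problem 13.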


Problems 1-5 will ask about a complete graph, when every pair of nodes is connected
by an edge. All off-diagonal entries in the adjacency matrix W are 1. All the degrees
in the diagonal D are n − 1. The Laplacians L and K have no zeros. Every question about
$L = A^TA = D - W$ has a good answer for this graph with all possible edges.


Here is a picture that summarizes this three-step vision of applied mathematics.


[Figure 7.7 diagram: the voltages $v_1, \ldots, v_n$ lead through $A$ to the voltage drops $e = b - Av$.
Ohm’s Law leads through $C$ to the currents $w = Ce$. The Current Law $A^Tw = f$ leads back
through $A^T$. $A^TCA$ is the conductance matrix, in the equation $A^TCAv = A^TCb - f$.]


Figure 7.7: The $A^TCA$ framework for steady state problems in science and engineering.


Saddle-Point Matrix


The final matrix is $A^TCA$, after the edge currents $w_1, \ldots, w_m$ are eliminated. Before we
took that step, the voltages v and the currents w were the two unknown vectors. With
two equations we have a “saddle-point matrix” that contains $C^{-1}$ and $A$ and $A^T$:


Saddle-point problem
Currents and voltages          $\begin{bmatrix} C^{-1} & A \\ A^T & 0 \end{bmatrix} \begin{bmatrix} w \\ v \end{bmatrix} = \begin{bmatrix} b \\ f \end{bmatrix}$     (9)


Block matrices of this form appear when there is a constraint like Kirchhoff’s Current
Law $A^Tw = f$. “Nature minimizes heat loss in the network subject to that constraint.”
The “KKT matrix” in (9) is symmetric but it is not at all positive definite.
A small example will show a positive and also a negative eigenvalue:


$\begin{bmatrix} 3 & 2 \\ 2 & 0 \end{bmatrix}$ has eigenvalues 4 and $-1$. The pivots are 3 and $-\tfrac{4}{3}$.


Eigenvalues and pivots have the same signs! Multiply the eigenvalues or the pivots to
reach the determinant $-4$. The zero on the diagonal rules out positive definiteness.

The saddle-point matrix has m positive and n negative eigenvalues. The energy in
(m + n)-dimensional space goes upward in m directions and downward in n directions.
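
A quick numerical check of the small example is sketched below. It assumes the 2 by 2 matrix with rows (3, 2) and (2, 0), which is a symmetric matrix with a zero on the diagonal that has the eigenvalues 4 and −1 and the pivots 3 and −4/3 quoted above; the function names are illustrative.

```python
import math


def eigenvalues_2x2_symmetric(a, b, d):
    """Eigenvalues of [[a, b], [b, d]] from the quadratic formula."""
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return mean + radius, mean - radius


def pivots_2x2(a, b, d):
    """Pivots from elimination on [[a, b], [b, d]], assuming a is nonzero."""
    return a, d - b * b / a


eigs = eigenvalues_2x2_symmetric(3, 2, 0)
pivs = pivots_2x2(3, 2, 0)
print(eigs)                 # one positive and one negative eigenvalue
print(pivs)                 # one positive and one negative pivot
print(eigs[0] * eigs[1])    # product of eigenvalues = determinant
print(pivs[0] * pivs[1])    # product of pivots = determinant
```

Both products equal the determinant, and the signs of the pivots match the signs of the eigenvalues.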

An important computational decision has voters on both sides. Is it better to eliminate
w and work with one matrix $A^TCA$? Optimizers say no, finite element engineers say yes.
Fluids calculations (with pressure dual to velocity) often look for the saddle point.

Computational science and engineering is a highly active subject, a mix of software
and hardware and mathematics in solving $A^TCA$ equations with millions of unknowns.


■ REVIEW OF THE KEY IDEAS ■

1. Row k of A (m by n) tells the start node and the end node of edge k in the graph.
2. The Laplacian $L = A^TA$ has $L_{ij} = -1$ when an edge connects nodes i and j.
3. The diagonal of L = D − W shows the degrees of the nodes. Each row adds to zero.
4. With weights $c_k$ on the edges, $K = A^TCA$ is the weighted graph Laplacian.
5. Three steps $e = b - Av$, $w = Ce$, $f = A^Tw$ combine into $A^TCAv = A^TCb - f$.


Problem Set 7.5 


Problems 1-5 are about complete graphs. Every pair of nodes has an edge.


1   With n = 5 nodes and all edges, find the diagonal entries of $A^TA$ (the degrees of
    the nodes). All the off-diagonal entries of $A^TA$ are $-1$. Show the reduced matrix R
    without row 5 and column 5. Node 5 is “grounded” and $v_5 = 0$.


2   Show that the trace of $A^TA$ (sum down the diagonal = sum of eigenvalues)
    is $n^2 - n$. What is the trace of the reduced (and invertible) matrix R of size n − 1?


3   For n = 4, write the 3 by 3 matrix $R = (A_{\text{reduced}})^T(A_{\text{reduced}})$. Show that
    $RR^{-1} = I$ when $R^{-1}$ has all entries $\tfrac14$ off the diagonal and $\tfrac24$ on the diagonal.


4   For every n, the reduced matrix R of size n − 1 is invertible. Show that $RR^{-1} = I$
    when $R^{-1}$ has all entries 1/n off the diagonal and 2/n on the diagonal.


5   Write the 6 by 3 matrix $M = A_{\text{reduced}}$ when n = 4. The equation Mv = b is to
    be solved by least squares. The vector b is like scores in 6 games between 4 teams
    (team 4 always scores zero; it is grounded). Knowing the inverse of $R = M^TM$,
    what is the least squares ranking $\hat v_1$ for team 1 from solving $M^TM\hat v = M^Tb$?


6   For the tree graph with 4 nodes, $A^TA$ is in equation (1). What is the 3 by 3 matrix
    $R = (A^TA)_{\text{reduced}}$? How do we know it is positive definite?


7   (a) If you are given the matrix A, how could you reconstruct the graph?

    (b) If you are given $L = A^TA$, how could you reconstruct the graph (no arrows)?

    (c) If you are given $K = A^TCA$, how could you reconstruct the weighted graph?


8   Find $K = A^TCA$ for a line of 3 resistors with conductances $c_1 = 1$, $c_2 = 4$, $c_3 = 9$.
    Write $K_{\text{reduced}}$ and show that this matrix is positive definite.


9   A 3 by 3 square grid has n = 9 nodes and m = 12 edges. Number nodes by rows.

    (a) How many nonzeros among the 81 entries of $L = A^TA$?
    (b) Write down the 9 diagonal entries in the degree matrix D: they are not all 4.
    (c) Why does the middle row of L = D − W have four $-1$’s? Notice L = K2D!


10  Suppose all conductances in equation (5) are equal to c. Solve equation (6) for the
    voltages $v_2$ and $v_3$ and find the current I flowing out of node 1 (and into the ground
    at node 4). What is the “system conductance” I/V from node 1 to node 4?

    This overall conductance I/V should be larger than the individual conductances c.


11  The multiplication $A^TA$ can be columns of $A^T$ times rows of A. For the tree with
    m = 3 edges and n = 4 nodes, each (column times row) is (4 × 1)(1 × 4) = 4 × 4.
    Write down those three column-times-row matrices and add to get $L = A^TA$.


12  A graph with two separate 3-node trees is not connected. Write its 4 by 6 incidence
    matrix A (4 edges, 6 nodes). Find two solutions to Av = 0, not just one solution
    v = (1, 1, 1, 1, 1, 1). To reduce $A^TA$ we must ground two nodes and remove two
    rows and columns.


13  “Element matrices” from column times row appear in the finite element method.
    Include the numbers $c_1, c_2, c_3$ in the element matrices $K_1, K_2, K_3$.


    $K_i = (\text{row } i \text{ of } A)^T\,(c_i)\,(\text{row } i \text{ of } A)$          $K = A^TCA = K_1 + K_2 + K_3$.


    Write the element matrices that add to $A^TA$ in (1) for the 4-node line graph.


    [Diagram: $A^TA$ drawn as three overlapping 2 by 2 blocks $K_1$, $K_2$, $K_3$ down the diagonal]
    = assembly of the nonzero entries of $K_1 + K_2 + K_3$ from edges 1, 2, and 3

14  An n by n grid has $n^2$ nodes. How many edges in this graph? How many interior
    nodes? How many nonzeros in A and in $L = A^TA$? There are no zeros in $L^{-1}$!


15  When only $e = C^{-1}w$ is eliminated from the 3-step framework, equation (9) shows


    Saddle-point matrix
    Not positive definite          $\begin{bmatrix} C^{-1} & A \\ A^T & 0 \end{bmatrix} \begin{bmatrix} w \\ v \end{bmatrix} = \begin{bmatrix} b \\ f \end{bmatrix}$.


    Multiply the first block row by $A^TC$ and subtract from the second block row:


    After block elimination        $\begin{bmatrix} C^{-1} & A \\ 0 & -A^TCA \end{bmatrix} \begin{bmatrix} w \\ v \end{bmatrix} = \begin{bmatrix} b \\ f - A^TCb \end{bmatrix}$.


    After m positive pivots from $C^{-1}$, why does this matrix have negative pivots?
    The two-field problem for w and v is finding a saddle point, not a minimum.


16  The least squares equation $A^TAv = A^Tb$ comes from the projection equation
    $A^Te = 0$ for the error $e = b - Av$. Write those two equations in the symmetric
    saddle point form of Problem 15 (with f = 0).

    In this case w = e because the weighting matrix is C = I.


17  Find the three eigenvalues and three pivots and the determinant of this saddle point
    matrix with C = I. One eigenvalue is negative because A has one column:


    $M = \begin{bmatrix} C^{-1} & A \\ A^T & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 1 \\ -1 & 1 & 0 \end{bmatrix}$


■ CHAPTER 7 NOTES ■


Polar Form of an Invertible Matrix: A = QS = (orthogonal)(positive definite).
This is like $re^{i\theta}$ for complex numbers (1 by 1 matrices). $|e^{i\theta}| = 1$ is the orthogonal Q
and r > 0 is the positive definite S. The matrix factors come directly from the Singular
Value Decomposition of A:


$A = U\Sigma V^T = (UV^T)(V\Sigma V^T)$ = (orthogonal) times (positive definite).


When A is invertible, so is $\Sigma$. Then $\sigma_1$ to $\sigma_n$ are the (positive) eigenvalues of $V\Sigma V^T$.
In physical language, every motion combines a rotation/reflection Q with a stretching S.


Transpose of A = d/dx. It is not enough to say that “the transpose is −d/dx.”
The boundary conditions on the functions f and g in Af = df/dx and $A^Tg = -dg/dx$
are important parts of A and $A^T$. In Section 7.3 and especially Problem 1, A comes
with two conditions f(0) = 0 and f(1) = 0. Then $A^T = -d/dx$ has no conditions on
g. What we want is $(Af, g) = (f, A^Tg)$.

Integration by parts is like transposing the operator d/dx. The integrated term
fg is safely zero when f(0) = f(1) = 0. The fixed-free operator d/dx with only one
condition f(0) = 0 would transpose to the free-fixed operator −d/dx with the other
condition g(1) = 0. Then the integrated term is again fg = 0 at both ends. In each case,
boundary conditions on g make up for missing boundary conditions on f.


Principal Component Analysis (PCA): Find the most significant (least random) data.


Data often comes in rectangular matrices: A grade for each student in each course.
Activity of each gene in each disease. Sales of each product in each store. Income in each
age group in each city. An entry goes into each column and each row of the data matrix.

By subtracting off the means, we study the variances: measures of useful information
as opposed to randomness. The SVD of the data matrix A (showing the eigenvectors and
eigenvalues of the correlation $A^TA$) displays the principal component: the largest piece
$\sigma_1u_1v_1^T$ of the matrix. The orthogonal pieces $\sigma_iu_iv_i^T$ are in order of importance. The
largest $\sigma$ is the most significant. From a large matrix of partly random data, PCA and the
SVD extract its most significant information.

Wikipedia lists many methods that are identical or closely related to PCA. The crucial
singular vector $v_1$ (which has $A^TAv_1 = \lambda_{\max}v_1$) is also the vector that maximizes
the Rayleigh quotient $(v^TA^TAv)/v^Tv$. Computing the first few singular vectors does not
require the whole SVD!


Chapter 8 


Fourier and Laplace Transforms 


This book began with linear differential equations. It will end that way. Those are the
equations we can understand and solve, especially when the coefficients are constant.
Even the heat equation and wave equation (those are PDE’s) have good solutions.


These are extremely nice problems, no apologies for that. Almost every application 
starts with a linear response—current proportional to voltage, output proportional to input. 
For large voltages or large forces, the true law may become nonlinear. Even then, we 
often use a sequence of linear problems to deal with nonlinearity. The constant coefficient 
linear equation is the one we can solve. 


This chapter introduces Fourier transforms and Laplace transforms. They express
every input f(x) and f(t) and every output y(x) and y(t) as a combination of exponentials.
For each exponential, the output multiplies the input by a constant that depends on
the frequency: $y(t) = Y(s)e^{st}$ or $Y(\omega)e^{i\omega t}$. That transfer function describes
the system by its frequency response: the constants Y that multiply exponentials.


We have used the complex gain $1/(i\omega - a)$ to invert $y' - ay$, along with transfer
functions in Chapters 1 and 2. Now we see them for every time-invariant and shift-invariant
partial differential equation, with coefficients that are constant in time and space.


Naturally those ideas appear again for discrete problems with matrix equations. The 
matrices may be approximating derivatives (like the —1,2,—1 second difference matrix). 
Or they come on their own from convolutions. Their eigenvectors will be discrete sines 
or cosines or complex exponentials. A combination of those eigenvectors is a discrete 
Fourier series (DFT). We find the coefficients in that combination by using the Fast Fourier 
Transform (FFT)— the most important algorithm in modern applied mathematics. 


A note about sines and cosines versus complex exponentials. For real problems we 
may like sines and cosines. But they aren’t perfect. We keep cos 0 and we don’t keep sin 0. 
We want one of the highest frequency vectors (1, —1,1, —1,...) and (—1,1, —1,1,...) but 
not both. In the end (and almost always for the FFT) the complex exponentials win. 
After all, they are eigenfunctions of the derivative d/dx. Transforms are based on
combinations of those exponentials, and the derivative of $e^{i\omega x}$ is just $i\omega e^{i\omega x}$.

This page describes a specially nice function space. It is called “Hilbert space.”
The functions have dot products and lengths. There are angles between functions,
so two functions can be orthogonal (perpendicular). The functions in Hilbert space are
just like vectors. In fact they are vectors, but Hilbert space is infinite-dimensional.

Here are parallels between real vectors $f = (f_1, \ldots, f_n)$ and real functions f(x).
Physicists even separate $\langle f|$ (bra) from $|g\rangle$ (ket). Not here!


                    Vectors                                   Functions
Inner product       $f^Tg = f_1g_1 + \cdots + f_ng_n$         $\langle f, g\rangle = \int_{-\pi}^{\pi} f(x)\,g(x)\,dx$
Length squared      $\|f\|^2 = f^Tf = \sum f_i^2$             $\|f\|^2 = \langle f, f\rangle = \int_{-\pi}^{\pi} |f(x)|^2\,dx$
Angle $\theta$      $\cos\theta = f^Tg/\|f\|\,\|g\|$          $\cos\theta = \langle f, g\rangle/\|f\|\,\|g\|$
Orthogonality       $f^Tg = 0$                                $\langle f, g\rangle = \int_{-\pi}^{\pi} f(x)\,g(x)\,dx = 0$


A function is allowed into Hilbert space if it has a finite length: $\int |f(x)|^2\,dx < \infty$.
Thus f(x) = 1/x and $f(x) = \delta(x)$ do not belong to Hilbert space. But a step function
is good. And the function can even blow up at a point, just not too fast. For example
$f(x) = 1/|x|^{1/4}$ belongs to Hilbert space and its length is $\|f\| = 2\pi^{1/4}$:


as 


f(0) is infinite but || f||? = i |x|~1/?da = 4 er? |, =4An/?, 


When |f(a)| = |f(—2)|, the integral from —7 to 7 is twice the integral from 0 to z. 


There is always an adjustment for complex vectors and functions:

Inner product     $\bar{f}^T g = \bar{f}_1 g_1 + \cdots + \bar{f}_N g_N$     $\langle f, g \rangle = \int \overline{f(x)}\,g(x)\,dx$


Orthogonality is still $\langle f, g \rangle = 0$. The best examples are the complex exponentials:

$e^{ikx}$ and $e^{inx}$ are orthogonal     $\int_{-\pi}^{\pi} e^{-ikx} e^{inx}\,dx = \left[ \frac{e^{i(n-k)x}}{i(n-k)} \right]_{-\pi}^{\pi} = 0$ for $n \neq k$.

Those $e^{ikx}$ are an orthogonal basis for Hilbert space. Instead of xyz axes, functions
need infinitely many axes. Every $f(x)$ is a combination of the basis vectors $e^{ikx}$:

$f(x) = \frac{e^{ix} - e^{-ix}}{1} + \frac{e^{3ix} - e^{-3ix}}{3} + \cdots$ has $\|f\|^2 = 2\pi\left( \frac{1}{1^2} + \frac{1}{1^2} + \frac{1}{3^2} + \frac{1}{3^2} + \cdots \right)$.

This particular $f(x)$ happens to be a step function. To Hilbert, step functions are vectors.
Then Fourier “transformed” $f(x)$ into the numbers (like 1 and 1/3) that multiply each $e^{ikx}$.
8.1 Fourier Series


This section explains three Fourier series: sines, cosines, and exponentials $e^{ikx}$.
Square waves (1 or 0 or $-1$) are great examples, with delta functions in the derivative.
We look at a spike, a step function, and a ramp, and smoother functions too.

Start with $\sin x$. It has period $2\pi$ since $\sin(x + 2\pi) = \sin x$. It is an odd function since
$\sin(-x) = -\sin x$, and it vanishes at $x = 0$ and $x = \pi$. Every function $\sin nx$ has those
three properties, and Fourier looked at infinite combinations of the sines:


Fourier sine series     $S(x) = b_1 \sin x + b_2 \sin 2x + b_3 \sin 3x + \cdots = \sum_{n=1}^{\infty} b_n \sin nx$     (1)

If the numbers $b_1, b_2, b_3, \ldots$ drop off quickly enough (we are foreshadowing the
importance of their decay rate) then the sum $S(x)$ will inherit all three properties:

Periodic $S(x + 2\pi) = S(x)$     Odd $S(-x) = -S(x)$     $S(0) = S(\pi) = 0$


200 years ago, Fourier startled the mathematicians in France by suggesting that any odd
periodic function $S(x)$ could be expressed as an infinite series of sines. This idea started
an enormous development of Fourier series. Our first step is to find the number $b_k$ that
multiplies $\sin kx$. The function $S(x)$ is “transformed” to a sequence of b’s.


Suppose $S(x) = \sum b_n \sin nx$. Multiply both sides by $\sin kx$. Integrate from 0 to $\pi$:

$\int_0^{\pi} S(x) \sin kx\,dx = \int_0^{\pi} b_1 \sin x \sin kx\,dx + \cdots + \int_0^{\pi} b_k \sin kx \sin kx\,dx + \cdots$     (2)


On the right side, all integrals are zero except the highlighted one with $n = k$. This
property of “orthogonality” will dominate the whole chapter. For sines, integral = 0 is a
fact of calculus:

Sines are orthogonal     $\int_0^{\pi} \sin nx \sin kx\,dx = 0$ if $n \neq k$.     (3)

Zero comes quickly if we integrate $\int_0^{\pi} \cos mx\,dx = \left[ \frac{\sin mx}{m} \right]_0^{\pi} = 0 - 0$. So we use this:

Product of sines     $\sin nx \sin kx = \frac{1}{2}\cos(n - k)x - \frac{1}{2}\cos(n + k)x$.     (4)


Integrating $\cos(n - k)x$ and $\cos(n + k)x$ gives zero, proving orthogonality of the sines.
The exception is when $n = k$. Then we are integrating $(\sin kx)^2 = \frac{1}{2} - \frac{1}{2}\cos 2kx$:

$\int_0^{\pi} \sin kx \sin kx\,dx = \int_0^{\pi} \frac{1}{2}\,dx - \int_0^{\pi} \frac{1}{2}\cos 2kx\,dx = \frac{\pi}{2}$.     (5)

The highlighted term in equation (2) is $(\pi/2)\,b_k$. Multiply both sides by $2/\pi$ to find $b_k$.

Sine coefficients, $S(-x) = -S(x)$     $b_k = \frac{2}{\pi}\int_0^{\pi} S(x) \sin kx\,dx = \frac{1}{\pi}\int_{-\pi}^{\pi} S(x) \sin kx\,dx$.     (6)


Notice that $S(x) \sin kx$ is even (equal integrals from $-\pi$ to 0 and from 0 to $\pi$).
I will go immediately to the most important example of a Fourier sine series.


$S(x)$ is an odd square wave with $SW(x) = 1$ for $0 < x < \pi$. It is drawn in
Figure 8.1 as an odd function (with period $2\pi$) that vanishes at $x = 0$ and $x = \pi$.

Figure 8.1: The odd square wave with $SW(x + 2\pi) = SW(x) = \{1 \text{ or } 0 \text{ or } -1\}$.


Example 1 Find the Fourier sine coefficients $b_k$ of the odd square wave $SW(x)$.

Solution For $k = 1, 2, \ldots$ use formula (6) with $S(x) = 1$ between 0 and $\pi$:

$b_k = \frac{2}{\pi}\int_0^{\pi} \sin kx\,dx = \frac{2}{\pi}\left[ \frac{-\cos kx}{k} \right]_0^{\pi} = \frac{2}{\pi}\left\{ \frac{2}{1}, \frac{0}{2}, \frac{2}{3}, \frac{0}{4}, \frac{2}{5}, \frac{0}{6}, \ldots \right\}$     (7)

The even-numbered coefficients $b_{2k}$ are all zero because $\cos 2k\pi = \cos 0 = 1$. The odd-numbered
coefficients $b_k = 4/\pi k$ decrease at the rate $1/k$. We will see that same $1/k$ decay
rate for all functions formed from smooth pieces and jumps.
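
The result of Example 1 can be checked numerically. The sketch below is only an illustration: the helper name `sine_coefficient`, the midpoint rule, and the choice of 20000 sample points are assumptions made here, not part of the text. It applies formula (6) to $S(x) = 1$ on $0 < x < \pi$ and prints each computed $b_k$ next to $4/\pi k$ (odd $k$) or 0 (even $k$).

```python
import math

def sine_coefficient(S, k, samples=20000):
    """Approximate b_k = (2/pi) * integral_0^pi S(x) sin(kx) dx by the midpoint rule."""
    h = math.pi / samples
    total = 0.0
    for j in range(samples):
        x = (j + 0.5) * h              # midpoint of the j-th subinterval
        total += S(x) * math.sin(k * x)
    return (2.0 / math.pi) * total * h

def square_wave(x):
    """Odd square wave restricted to 0 < x < pi, the only part formula (6) needs."""
    return 1.0

for k in range(1, 8):
    exact = 4.0 / (math.pi * k) if k % 2 == 1 else 0.0
    print(k, round(sine_coefficient(square_wave, k), 6), round(exact, 6))
```

The two printed columns should agree closely, with the even-numbered coefficients near zero.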


Put those coefficients $4/\pi k$ and zero into the Fourier sine series for $SW(x)$:

Square wave     $SW(x) = \frac{4}{\pi}\left[ \frac{\sin x}{1} + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \frac{\sin 7x}{7} + \cdots \right]$     (8)

Figure 8.2 graphs this sum after one term, then two terms, and then five terms. You can see
the all-important Gibbs phenomenon appearing as these “partial sums” include more terms.
Away from the jumps, we safely approach $SW(x) = 1$ or $-1$. At $x = \pi/2$, the series gives
a beautiful alternating formula for the number $\pi$:

$1 = \frac{4}{\pi}\left( \frac{1}{1} - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots \right)$ so that $\pi = 4\left( \frac{1}{1} - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots \right)$.     (9)

The Gibbs phenomenon is the overshoot that moves closer and closer to the jumps.
Its height approaches 1.18... and it does not decrease with more terms of the series.
This overshoot is the one greatest obstacle to calculation of all discontinuous functions
(like shock waves). We try hard to avoid Gibbs but sometimes we can’t.
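
The overshoot can be observed directly from the partial sums of series (8). The sketch below is an illustration under stated assumptions: the function names and the grid of 2000 points on $0 < x \le \pi/2$ are choices made here, not taken from the text. It reports the largest value of the partial sum for 5, 20, and 80 nonzero terms.

```python
import math

def square_wave_partial_sum(x, terms):
    """First `terms` nonzero terms of (8): (4/pi) * sum of sin((2m+1)x)/(2m+1)."""
    return (4.0 / math.pi) * sum(
        math.sin((2 * m + 1) * x) / (2 * m + 1) for m in range(terms)
    )

def peak_near_jump(terms, grid=2000):
    """Largest value of the partial sum on a uniform grid over 0 < x <= pi/2."""
    step = (math.pi / 2) / grid
    return max(square_wave_partial_sum((j + 1) * step, terms) for j in range(grid))

for terms in (5, 20, 80):
    print(terms, round(peak_near_jump(terms), 4))
```

The peak should stay near 1.18 instead of dropping toward 1 as terms are added, while its location moves toward the jump at $x = 0$.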
[Figure 8.2 graphs: the dashed curve $\frac{4}{\pi}\sin x$, the solid curve $\frac{4}{\pi}\left( \frac{\sin x}{1} + \frac{\sin 3x}{3} \right)$, and the 5-term sum $\frac{4}{\pi}\left( \frac{\sin x}{1} + \cdots + \frac{\sin 9x}{9} \right)$ with the Gibbs overshoot marked.]

Figure 8.2: The sums $b_1 \sin x + \cdots + b_N \sin Nx$ overshoot the square wave near jumps.


Fourier Cosine Series 


The cosine series applies to even functions $C(x) = C(-x)$. They are symmetric across 0:

Cosine series     $C(x) = a_0 + a_1 \cos x + a_2 \cos 2x + \cdots = a_0 + \sum_{n=1}^{\infty} a_n \cos nx$.     (10)

Every cosine has period $2\pi$. Figure 8.3 shows two even functions, the repeating ramp
$RR(x)$ and the up-down train $UD(x)$ of delta functions. That sawtooth ramp $RR$ is the
integral of the square wave. The delta functions in $UD$ give the derivative of the square wave.
(For sines, the integral and derivative are cosines.) $RR$ and $UD$ will be valuable examples,
one smoother than $SW$, one less smooth.

First we find formulas for the cosine coefficients $a_0$ and $a_k$. The constant term $a_0$ is
the average value of the function $C(x)$:

$a_0$ = average     $a_0 = \frac{1}{\pi}\int_0^{\pi} C(x)\,dx = \frac{1}{2\pi}\int_{-\pi}^{\pi} C(x)\,dx$.     (11)


I just integrated every term in the cosine series (10) from 0 to $\pi$. On the right side, the
integral of $a_0$ is $a_0\pi$ (divide both sides by $\pi$). All other integrals are zero:

$\int_0^{\pi} \cos nx\,dx = \left[ \frac{\sin nx}{n} \right]_0^{\pi} = 0 - 0 = 0$.     (12)


In words, the constant function 1 is orthogonal to $\cos nx$ over the interval $[0, \pi]$.
The other cosine coefficients $a_k$ come from the orthogonality of cosines. As with sines,
we multiply both sides of (10) by $\cos kx$ and integrate from 0 to $\pi$:

$\int_0^{\pi} C(x) \cos kx\,dx = \int_0^{\pi} a_0 \cos kx\,dx + \int_0^{\pi} a_1 \cos x \cos kx\,dx + \cdots + \int_0^{\pi} a_k (\cos kx)^2\,dx + \cdots$

You know what is coming. On the right side, only the highlighted term can be nonzero. For
$k > 0$, that bold nonzero term is $a_k \pi/2$. Multiply both sides by $2/\pi$ to find $a_k$:

Cosine coefficients, $C(-x) = C(x)$     $a_k = \frac{2}{\pi}\int_0^{\pi} C(x) \cos kx\,dx = \frac{1}{\pi}\int_{-\pi}^{\pi} C(x) \cos kx\,dx$.     (13)
[Figure 8.3 graphs: the repeating ramp $RR(x)$ (integral of the square wave) and the up-down train $UD(x)$ (derivative of the square wave), with upward spikes $2\delta(x)$ and $2\delta(x - 2\pi)$ and downward spikes $-2\delta(x + \pi)$ and $-2\delta(x - \pi)$.]

Figure 8.3: The repeating ramp $RR$ and the up-down $UD$ (periodic spikes) are even.
The slope of $RR$ is $-1$ then 1: odd square wave $SW$. The next derivative is $UD$: $\pm 2\delta$.


Example 2 Find the cosine coefficients of the ramp $RR(x)$ and the up-down $UD(x)$.

Solution The simplest way is to start with the sine series for the square wave:

$SW(x) = \frac{4}{\pi}\left[ \frac{\sin x}{1} + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \frac{\sin 7x}{7} + \cdots \right]$ = slope of $RR$

Take the derivative of every term to produce cosines in the up-down delta function:

Up-down spikes     $UD(x) = \frac{4}{\pi}\left[ \cos x + \cos 3x + \cos 5x + \cos 7x + \cdots \right]$.     (14)


Those coefficients don’t decay at all. The terms in the series don’t approach zero, so
officially the series cannot converge. Nevertheless it is correct and important. At $x = 0$,
the cosines are all 1 and their sum is $+\infty$. At $x = \pi$, the cosines are all $-1$. Then
their sum is $-\infty$. (The downward spike is $-2\delta(x - \pi)$.) The true way to recognize $\delta(x)$
is by the integral test $\int \delta(x) f(x)\,dx = f(0)$ and Example 3 will do this.

For the repeating ramp, we integrate the square wave series for $SW(x)$ and add $a_0$.
The average ramp height is $a_0 = \pi/2$, halfway from 0 to $\pi$:

Ramp series     $RR(x) = \frac{\pi}{2} - \frac{4}{\pi}\left[ \frac{\cos x}{1^2} + \frac{\cos 3x}{3^2} + \frac{\cos 5x}{5^2} + \frac{\cos 7x}{7^2} + \cdots \right]$.     (15)


The constant of integration is $a_0$. Those coefficients $a_k$ drop off like $1/k^2$. They could
be computed directly from formula (13) using $\int x \cos kx\,dx$, and integration by parts (or
an appeal to Mathematica or Maple). It was much easier to integrate every sine separately
in $SW(x)$, which makes clear the crucial point: Each “degree of smoothness” in the
function brings a faster decay rate of its Fourier coefficients $a_k$ and $b_k$.
Every integration divides those numbers by $k$.

No decay                     Delta functions (with spikes)
$1/k$ decay                  Step functions (with jumps)
$1/k^2$ decay                Ramp functions (with corners)
$1/k^4$ decay                Spline functions (jumps in $f'''$)
$r^k$ decay with $r < 1$     Analytic functions like $1/(2 - \cos x)$
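
The $1/k^2$ decay for the ramp can be checked against formula (13). The sketch below is an illustration with assumptions labeled: it takes $RR(x) = |x|$ on $-\pi \le x \le \pi$ (one reading of Figure 8.3, consistent with slope $\pm 1$ and average $\pi/2$), and the helper name `cosine_coefficient` and the midpoint rule are choices made here. It prints each numerical $a_k$ beside $-4/\pi k^2$ from series (15).

```python
import math

def cosine_coefficient(C, k, samples=20000):
    """Approximate a_k = (2/pi) * integral_0^pi C(x) cos(kx) dx for k > 0 (midpoint rule)."""
    h = math.pi / samples
    total = 0.0
    for j in range(samples):
        x = (j + 0.5) * h
        total += C(x) * math.cos(k * x)
    return (2.0 / math.pi) * total * h

def ramp(x):
    """Assumed ramp RR(x) = |x| on -pi <= x <= pi, an even function with a corner at 0."""
    return abs(x)

for k in (1, 3, 5, 7):
    print(k, cosine_coefficient(ramp, k), -4.0 / (math.pi * k * k))
```

The even-numbered coefficients of this ramp come out near zero, matching the odd-only cosines in (15).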




The Fourier Series for a Delta Function 
Example 3 Find the (cosine) coefficients of the delta function $\delta(x)$, made $2\pi$-periodic.

Solution The spike in $\delta(x)$ occurs at $x = 0$. All the integrals are 1, because the
cosine of 0 is 1. We divide by $2\pi$ for $a_0$ and by $\pi$ for the other cosine coefficients $a_k$.

Average $a_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} \delta(x)\,dx = \frac{1}{2\pi}$     Cosines $a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} \delta(x) \cos kx\,dx = \frac{1}{\pi}$

Then the series for the delta function has all cosines in equal amounts: No decay.

Delta function     $\delta(x) = \frac{1}{2\pi} + \frac{1}{\pi}\left[ \cos x + \cos 2x + \cos 3x + \cdots \right]$.     (16)
Te ae 


This series cannot truly converge (its terms don’t approach zero). But we can graph the sum
after $\cos 5x$ and after $\cos 10x$. Figure 8.4 shows how these “partial sums” are doing their
best to approach $\delta(x)$. They oscillate faster while going higher.

There is a neat formula for the sum $\delta_N$ that stops at $\cos Nx$. Start by writing each term
$2\cos nx$ as $e^{inx} + e^{-inx}$. We get a geometric progression from $e^{-iNx}$ up to $e^{iNx}$.

$\delta_N = \frac{1}{2\pi}\left[ 1 + e^{ix} + e^{-ix} + \cdots + e^{iNx} + e^{-iNx} \right] = \frac{1}{2\pi}\,\frac{\sin(N + \frac{1}{2})x}{\sin \frac{1}{2}x}$.     (17)

This is the function graphed in Figure 8.4.
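
Formula (17) can be compared with the direct sum of cosines. The sketch below is an illustration: the function names are hypothetical, and the closed form is only evaluated where $\sin \frac{1}{2}x \neq 0$. It also prints the height at $x = 0$, which the direct sum gives as $(2N + 1)/2\pi$.

```python
import math

def delta_partial_sum(x, N):
    """(1 + 2cos x + ... + 2cos Nx) / (2*pi): series (16) stopped at cos Nx."""
    return (1.0 + 2.0 * sum(math.cos(n * x) for n in range(1, N + 1))) / (2.0 * math.pi)

def delta_closed_form(x, N):
    """Right side of (17); assumes sin(x/2) is not zero."""
    return math.sin((N + 0.5) * x) / (2.0 * math.pi * math.sin(0.5 * x))

for x in (0.3, 1.0, 2.5):
    print(x, delta_partial_sum(x, 10), delta_closed_form(x, 10))

print("height at x = 0:", delta_partial_sum(0.0, 10), "compared with", 21 / (2 * math.pi))
```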


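[Figure 8.4 graphs: $\delta_{10}(x)$ with height $21/2\pi$ and $\delta_5(x)$ with height $11/2\pi$ at $x = 0$, compared with the levels $1/2\pi$ and $-1/2\pi$.]

Figure 8.4: The sums $\delta_N(x) = (1 + 2\cos x + \cdots + 2\cos Nx)/2\pi$ try to approach $\delta(x)$.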
Complete Series: Sines and Cosines


Over the half-period $[0, \pi]$, the sines are not orthogonal to all the cosines. In fact the
integral of $\sin x$ times 1 is not zero. So for functions $F(x)$ that are not odd or even, we must
move to the complete series (sines plus cosines) on the full interval. Since our functions
are periodic, that “full interval” can be $[-\pi, \pi]$ or $[0, 2\pi]$. We have both a’s and b’s.

Complete Fourier series     $F(x) = a_0 + \sum_{n=1}^{\infty} a_n \cos nx + \sum_{n=1}^{\infty} b_n \sin nx$.     (18)


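On every “$2\pi$ interval” the sines and cosines are orthogonal. We find the Fourier
coefficients $a_k$ and $b_k$ in the usual way: Multiply (18) by 1 and $\cos kx$ and $\sin kx$.
Then integrate both sides from $-\pi$ to $\pi$ to get $a_0$ and $a_k$ and $b_k$.

$a_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} F(x)\,dx$     $a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} F(x) \cos kx\,dx$     $b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} F(x) \sin kx\,dx$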


Orthogonality kills off infinitely many integrals and leaves only the one we want.
Another approach is to split $F(x) = C(x) + S(x)$ into an even part and an odd part.
Then we can use the earlier cosine and sine formulas. The two parts are

$C(x) = F_{\text{even}}(x) = \frac{F(x) + F(-x)}{2}$     $S(x) = F_{\text{odd}}(x) = \frac{F(x) - F(-x)}{2}$     (19)


The even part gives the a’s and the odd part gives the b’s. Test on a square pulse from
$x = 0$ to $x = h$. This one-sided thin box function is not odd or even.

Example 4 Find the a’s and b’s if $F(x)$ = tall box = $1/h$ for $0 < x < h$ and 0 for $h < x < 2\pi$.

Solution The integrals for $a_0$ and $a_k$ and $b_k$ stop at $x = h$ where $F(x)$ drops to zero.
The coefficients decay like $1/k$ because of the jump at $x = 0$ and the drop at $x = h$:

Coefficients of square pulse     $a_0 = \frac{1}{2\pi}\int_0^h \frac{1}{h}\,dx = \frac{1}{2\pi}$ = average

$a_k = \frac{1}{\pi h}\int_0^h \cos kx\,dx = \frac{\sin kh}{\pi kh}$     $b_k = \frac{1}{\pi h}\int_0^h \sin kx\,dx = \frac{1 - \cos kh}{\pi kh}$


Important As $h$ approaches zero, the box gets thinner and taller. Its width is $h$ and its
height is $1/h$ and its area is 1. The box approaches a delta function! And its Fourier
coefficients approach the coefficients of the delta function as $h \to 0$:

$a_0 = \frac{1}{2\pi}$     $a_k = \frac{\sin kh}{\pi kh}$ approaches $\frac{1}{\pi}$     $b_k = \frac{1 - \cos kh}{\pi kh}$ approaches 0.     (20)




Energy in Function = Energy in Coefficients 


There is an extremely important equation (the energy identity) that comes from integrating
$(F(x))^2$. When we square the Fourier series of $F(x)$, and integrate from $-\pi$ to $\pi$,
all the “cross terms” drop out. The only nonzero integrals come from $1^2$ and $\cos^2 kx$
and $\sin^2 kx$. Those integrals give $2\pi$ and $\pi$ and $\pi$, multiplied by $a_0^2$ and $a_k^2$ and $b_k^2$:

Energy     $\int_{-\pi}^{\pi} (F(x))^2\,dx = 2\pi a_0^2 + \pi\left( a_1^2 + b_1^2 + a_2^2 + b_2^2 + \cdots \right)$.     (21)

The energy in $F(x)$ equals the energy in the coefficients. The left side is like the length
squared of a vector, except the vector is a function. The right side comes from an infinitely
long vector of a’s and b’s. The lengths are equal, which says that the Fourier transform
from function to vector is like an orthogonal matrix. Normalized by $\sqrt{2\pi}$ and $\sqrt{\pi}$,
sines and cosines are an orthonormal basis in function space.
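
The energy identity (21) can be tested on the square wave, where every $a_k$ is zero and $b_k = 4/\pi k$ for odd $k$. The sketch below is an illustration (the function name is hypothetical): the left side is $\int_{-\pi}^{\pi} SW(x)^2\,dx = 2\pi$, and the right side is summed over more and more odd $k$.

```python
import math

def energy_from_coefficients(terms):
    """pi * (b_1^2 + b_3^2 + ...) over the first `terms` odd k, with b_k = 4/(pi*k)."""
    return math.pi * sum((4.0 / (math.pi * k)) ** 2 for k in range(1, 2 * terms, 2))

energy_in_function = 2.0 * math.pi     # SW(x)^2 = 1 on an interval of length 2*pi

for terms in (10, 1000, 100000):
    print(terms, energy_from_coefficients(terms), energy_in_function)
```

The sums increase toward $2\pi$ slowly, which reflects the slow $1/k$ decay of the coefficients of a function with jumps.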


Complex Exponentials $c_k e^{ikx}$


This is a small step and we have to take it. In place of separate formulas for $a_0$ and $a_k$
and $b_k$, we will have one formula for all the complex coefficients $c_k$. And the function
$F(x)$ might be complex (as in quantum mechanics). The Discrete Fourier Transform will
be much simpler when we use $N$ complex exponentials for a vector.

We practice with the complex infinite series for a $2\pi$-periodic function:

Complex Fourier series     $F(x) = c_0 + c_1 e^{ix} + c_{-1} e^{-ix} + \cdots = \sum_{n=-\infty}^{\infty} c_n e^{inx}$     (22)

If every $c_n = c_{-n}$, we can combine $e^{inx}$ with $e^{-inx}$ into $2\cos nx$. Then (22) is the
cosine series for an even function. If every $c_n = -c_{-n}$, we use $e^{inx} - e^{-inx} = 2i \sin nx$.
Then (22) is the sine series for an odd function and the c’s are pure imaginary.


To find $c_k$, multiply (22) by $e^{-ikx}$ (not $e^{ikx}$) and integrate from $-\pi$ to $\pi$:

$\int_{-\pi}^{\pi} F(x) e^{-ikx}\,dx = \int_{-\pi}^{\pi} c_0 e^{-ikx}\,dx + \int_{-\pi}^{\pi} c_1 e^{ix} e^{-ikx}\,dx + \cdots + \int_{-\pi}^{\pi} c_k e^{ikx} e^{-ikx}\,dx + \cdots$

The complex exponentials are orthogonal. Every integral on the right side is zero,
except for the highlighted term (when $n = k$ and $e^{ikx} e^{-ikx} = 1$). The integral of 1 is $2\pi$.
That surviving term gives the formula for $c_k$:

Fourier coefficients     $\int_{-\pi}^{\pi} F(x) e^{-ikx}\,dx = 2\pi c_k$ for $k = 0, \pm 1, \ldots$     (23)

Notice that $c_0 = a_0$ is still the average of $F(x)$. The orthogonality of $e^{inx}$ and $e^{ikx}$ is
checked by integrating $e^{inx}$ times $e^{-ikx}$. Remember to use that complex conjugate $e^{-ikx}$.
Example 5 For a delta function, all integrals are 1 and every $c_k$ is $1/2\pi$. Flat transform!

Example 6 Find $c_k$ for the $2\pi$-periodic shifted box $F(x)$ = 1 for $s < x < s + h$ and 0 elsewhere in $[-\pi, \pi]$.

Solution The integrals (23) have $F = 1$ from $s$ to $s + h$:

$c_k = \frac{1}{2\pi}\int_s^{s+h} 1 \cdot e^{-ikx}\,dx = \frac{1}{2\pi}\left[ \frac{e^{-ikx}}{-ik} \right]_s^{s+h} = e^{-iks}\left( \frac{1 - e^{-ikh}}{2\pi ik} \right)$     (24)

Notice above all the simple effect of the shift by $s$. It “modulates” each $c_k$ by $e^{-iks}$.
The energy is unchanged, the integral of $|F|^2$ just shifts, and $|e^{-iks}| = 1$.

Shift $F(x)$ to $F(x - s)$ $\longleftrightarrow$ Multiply every $c_k$ by $e^{-iks}$.     (25)


Example 7 A centered box has shift $s = -h/2$. It becomes balanced around $x = 0$.
This even function equals 1 on the interval from $-h/2$ to $h/2$:

Centered by $s = -\frac{h}{2}$     $c_k = e^{ikh/2}\,\frac{1 - e^{-ikh}}{2\pi ik} = \frac{1}{2\pi}\,\frac{\sin(kh/2)}{k/2}$

Divide by $h$ for a tall box. The ratio of $\sin(kh/2)$ to $kh/2$ is called the “sinc” of $kh/2$.

Tall box     $\frac{F_{\text{centered}}}{h} = \frac{1}{2\pi}\sum_{-\infty}^{\infty} \text{sinc}\left( \frac{kh}{2} \right) e^{ikx}$ = $1/h$ for $-h/2 \le x \le h/2$ and 0 elsewhere in $[-\pi, \pi]$

That division by $h$ produces area = 1. Every coefficient approaches $\frac{1}{2\pi}$ as $h \to 0$.

The Fourier series for the tall thin box again approaches the Fourier series for $\delta(x)$.


The Rules for Derivatives and Integrals 


The derivative of $e^{ikx}$ is $ike^{ikx}$. This great fact puts the Fourier functions $e^{ikx}$ in first
place for applications. They are eigenfunctions for $d/dx$ (and the eigenvalues are $\lambda = ik$).
Differential equations with constant coefficients are naturally solved by Fourier series.

Multiply by $ik$     The derivative of $F(x) = \sum c_k e^{ikx}$ is $dF/dx = \sum ik\,c_k e^{ikx}$

The second derivative has coefficients $(ik)^2 c_k = -k^2 c_k$. High frequencies are growing
stronger. And in the opposite direction (when we integrate), we divide by $ik$ and high
frequencies get weaker. The solution becomes smoother. Please look at this example:

Response $1/(k^2 + 1)$ to frequency $k$     $-\frac{d^2 y}{dx^2} + y = e^{ikx}$ is solved by $y(x) = \frac{e^{ikx}}{k^2 + 1}$

This was a typical problem in Chapter 2. The transfer function is $1/(k^2 + 1)$. There we
learned: The forcing function $e^{ikx}$ is exponential so the solution is exponential.
All we are doing now is superposition. Allow all the exponentials at once!

$-\frac{d^2 y}{dx^2} + y = \sum c_k e^{ikx}$ is solved by $y(x) = \sum \frac{c_k e^{ikx}}{k^2 + 1}$.     (26)

1. Derivative rule     $dF/dx$ has Fourier coefficients $ik\,c_k$ (energy moves to high $k$).

2. Shift rule     $F(x - s)$ has Fourier coefficients $e^{-iks} c_k$ (no change in energy).


Application: Laplace’s Equation in a Circle 


Our first application is to Laplace’s equation $u_{xx} + u_{yy} = 0$ (Section 7.4). The idea is
to construct $u(x, y)$ as an infinite series, choosing its coefficients to match $u_0(x, y)$
along the boundary. The shape of the boundary is crucial, and we take a circle of radius 1.

Begin with the solutions $1, r\cos\theta, r\sin\theta, r^2\cos 2\theta, r^2\sin 2\theta, \ldots$ to Laplace’s
equation. Combinations of these special solutions give all solutions in the circle:

$u(r, \theta) = a_0 + a_1 r\cos\theta + b_1 r\sin\theta + a_2 r^2\cos 2\theta + b_2 r^2\sin 2\theta + \cdots$     (27)

It remains to choose the constants $a_k$ and $b_k$ to make $u = u_0$ on the boundary. For a circle,
$\theta$ and $\theta + 2\pi$ give the same point. This means that $u_0(\theta)$ is periodic:

Set $r = 1$     $u_0(\theta) = a_0 + a_1\cos\theta + b_1\sin\theta + a_2\cos 2\theta + b_2\sin 2\theta + \cdots$     (28)

This is exactly the Fourier series for $u_0$. The constants $a_k$ and $b_k$ must be the Fourier
coefficients of $u_0(\theta)$. Thus Laplace’s boundary value problem is completely solved, if
an infinite series (27) is acceptable as the solution.


Example 8 Point source $u_0 = \delta(\theta)$. The boundary is held at $u_0 = 0$, except for the
source at $x = 1$, $y = 0$ (where $\theta = 0$). Find the temperature $u(r, \theta)$ inside the circle.

Delta function     $u_0(\theta) = \frac{1}{2\pi} + \frac{1}{\pi}(\cos\theta + \cos 2\theta + \cos 3\theta + \cdots) = \frac{1}{2\pi}\sum_{n=-\infty}^{\infty} e^{in\theta}$

Inside the circle, each $\cos n\theta$ is multiplied by $r^n$ to solve Laplace’s equation:

Inside the circle     $u(r, \theta) = \frac{1}{2\pi} + \frac{1}{\pi}(r\cos\theta + r^2\cos 2\theta + r^3\cos 3\theta + \cdots)$     (29)

Poisson managed to sum this infinite series! It involves a series of powers $(re^{i\theta})^n$.
His sum gives the response at every $(r, \theta)$ to the point source at $r = 1$, $\theta = 0$:

Temperature inside circle     $u(r, \theta) = \frac{1}{2\pi}\,\frac{1 - r^2}{1 + r^2 - 2r\cos\theta}$     (30)

At the center $r = 0$, this produces the average of $u_0 = \delta(\theta)$ which is $a_0 = 1/2\pi$.
On the boundary $r = 1$, this gives $u = 0$ except $u = \infty$ at the point where $\cos\theta = 1$.
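
Poisson’s sum can be checked numerically inside the circle. The sketch below is an illustration with hypothetical function names: it truncates series (29) after 200 terms (an arbitrary choice, adequate when $r$ is well below 1) and compares it with the closed form $\frac{1}{2\pi}(1 - r^2)/(1 + r^2 - 2r\cos\theta)$.

```python
import math

def poisson_series(r, theta, terms=200):
    """Series (29): 1/(2*pi) + (1/pi) * sum of r^n cos(n*theta), truncated after `terms`."""
    tail = sum(r ** n * math.cos(n * theta) for n in range(1, terms + 1))
    return 1.0 / (2.0 * math.pi) + tail / math.pi

def poisson_closed_form(r, theta):
    """Formula (30): (1/(2*pi)) * (1 - r^2) / (1 + r^2 - 2 r cos(theta))."""
    return (1.0 - r * r) / (2.0 * math.pi * (1.0 + r * r - 2.0 * r * math.cos(theta)))

for r, theta in ((0.0, 0.0), (0.5, 1.0), (0.8, 3.0)):
    print(r, theta, poisson_series(r, theta), poisson_closed_form(r, theta))
```

Near the boundary ($r$ close to 1) the truncated series needs many more terms, while the closed form stays cheap to evaluate.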
Example 9 $u_0(\theta) = 1$ on the top half of the circle and $u_0 = -1$ on the bottom half.

Solution The boundary values $u_0$ are a square wave $SW$. We know its sine series:

Square wave for $u_0(\theta)$     $SW(\theta) = \frac{4}{\pi}\left[ \frac{\sin\theta}{1} + \frac{\sin 3\theta}{3} + \frac{\sin 5\theta}{5} + \cdots \right]$     (31)

Inside the circle, multiplying by $r, r^2, r^3, \ldots$ gives fast decay of high frequencies:

Rapid decay inside     $u(r, \theta) = \frac{4}{\pi}\left[ \frac{r\sin\theta}{1} + \frac{r^3\sin 3\theta}{3} + \frac{r^5\sin 5\theta}{5} + \cdots \right]$     (32)

Laplace’s equation has smooth solutions inside, even when $u_0(\theta)$ is not smooth.
Problem Set 8.1 


1 (a) To prove that $\cos nx$ is orthogonal to $\cos kx$ when $k \neq n$, use the formula
$(\cos nx)(\cos kx) = \frac{1}{2}\cos(n + k)x + \frac{1}{2}\cos(n - k)x$. Integrate from $x = 0$ to
$x = \pi$. What is $\int \cos^2 kx\,dx$?

(b) From 0 to $\pi$, $\cos x$ is not orthogonal to $\sin x$. The period has to be $2\pi$:

Find $\int_0^{\pi} (\sin x)(\cos x)\,dx$ and $\int_{-\pi}^{\pi} (\sin x)(\cos x)\,dx$ and $\int_0^{2\pi} (\sin x)(\cos x)\,dx$.

2 Suppose $F(x) = x$ for $0 < x < \pi$. Draw graphs for $-2\pi < x < 2\pi$ to show
three extensions of $F$: a $2\pi$-periodic even function and a $2\pi$-periodic odd function
and a $\pi$-periodic function.

3 Find the Fourier series on $-\pi < x < \pi$ for

(a) $f_1(x) = \sin^3 x$, an odd function (sine series, only two terms)
(b) $f_2(x) = |\sin x|$, an even function (cosine series)
(c) $f_3(x) = x$ for $-\pi < x < \pi$ (sine series with jump at $x = \pi$)

4 Find the complex Fourier series $e^x = \sum c_k e^{ikx}$ on the interval $-\pi < x < \pi$.
The even part of a function is $\frac{1}{2}(f(x) + f(-x))$, so that $f_{\text{even}}(x) = f_{\text{even}}(-x)$. Find
the cosine series for $f_{\text{even}}$ and the sine series for $f_{\text{odd}}$. Notice the jump at $x = \pi$.

5 From the energy formula (21), the square wave sine coefficients satisfy

$\pi\left( b_1^2 + b_2^2 + \cdots \right) = \int_{-\pi}^{\pi} |SW(x)|^2\,dx = 2\pi$.

Substitute the numbers $b_k$ from equation (8) to find that $\pi^2 = 8\left( 1 + \frac{1}{9} + \frac{1}{25} + \cdots \right)$.

6 If a square pulse is centered at $x = 0$ to give

$f(x) = 1$ for $|x| < \frac{\pi}{2}$,     $f(x) = 0$ for $\frac{\pi}{2} < |x| < \pi$,

draw its graph and find its Fourier coefficients $a_k$ and $b_k$.
7 Plot the first three partial sums and the function $x(\pi - x)$:

$x(\pi - x) = \frac{8}{\pi}\left( \frac{\sin x}{1} + \frac{\sin 3x}{27} + \frac{\sin 5x}{125} + \cdots \right)$, $0 < x < \pi$.

Why is $1/k^3$ the decay rate for this function? What is its second derivative?

8 Sketch the $2\pi$-periodic half wave with $f(x) = \sin x$ for $0 < x < \pi$ and $f(x) = 0$
for $-\pi < x < 0$. Find its Fourier series.

9 Suppose $G(x)$ has period $2L$ instead of $2\pi$. Then $G(x + 2L) = G(x)$. Integrals
go from $-L$ to $L$ or from 0 to $2L$. The Fourier formulas change by a factor $\pi/L$:

The coefficients in $G(x) = \sum_{-\infty}^{\infty} C_k e^{ik\pi x/L}$ are $C_k = \frac{1}{2L}\int_{-L}^{L} G(x) e^{-ik\pi x/L}\,dx$.

Derive this formula for $C_k$: Multiply the first equation for $G(x)$ by ______ and
integrate both sides. Why is the integral on the right side equal to $2LC_k$?

10 For $G_{\text{even}}$, use Problem 9 to find the cosine coefficient $A_k$ from $(C_k + C_{-k})/2$:

$G_{\text{even}}(x) = \sum_0^{\infty} A_k \cos\frac{k\pi x}{L}$ has $A_k = \frac{1}{L}\int_{-L}^{L} G_{\text{even}}(x) \cos\frac{k\pi x}{L}\,dx$.

$G_{\text{even}}$ is $\frac{1}{2}(G(x) + G(-x))$. Exception for $A_0 = C_0$: Divide by $2L$ instead of $L$.

11 Problem 10 tells us that $a_k = \frac{1}{2}(c_k + c_{-k})$ on the usual interval from 0 to $\pi$.
Find a similar formula for $b_k$ from $c_k$ and $c_{-k}$. In the reverse direction, find the
complex coefficient $c_k$ in $F(x) = \sum c_k e^{ikx}$ from the real coefficients $a_k$ and $b_k$.

12 Find the solution to Laplace’s equation with $u_0 = \theta$ on the boundary. Why is this the
imaginary part of $2(z - z^2/2 + z^3/3 \cdots) = 2\log(1 + z)$? Confirm that on the unit
circle $z = e^{i\theta}$, the imaginary part of $2\log(1 + z)$ agrees with $\theta$.

13 If the boundary condition for Laplace’s equation is $u_0 = 1$ for $0 < \theta < \pi$ and
$u_0 = 0$ for $-\pi < \theta < 0$, find the Fourier series solution $u(r, \theta)$ inside the unit circle.
What is $u$ at the origin $r = 0$?


14 With boundary values $u_0(\theta) = 1 + \frac{1}{2}e^{i\theta} + \frac{1}{4}e^{2i\theta} + \cdots$, what is the Fourier series
solution to Laplace’s equation in the circle? Sum this geometric series.

15 (a) Verify that the fraction in Poisson’s formula (30) satisfies Laplace’s equation.

(b) Find the response $u(r, \theta)$ to an impulse at $x = 0$, $y = 1$ (where $\theta = \pi/2$).

16 With complex exponentials in $F(x) = \sum c_k e^{ikx}$, the energy identity (21) changes to
$\int_{-\pi}^{\pi} |F(x)|^2\,dx = 2\pi\sum |c_k|^2$. Derive this by integrating $\left( \sum c_k e^{ikx} \right)\left( \sum \bar{c}_k e^{-ikx} \right)$.

17 A centered square wave has $F(x) = 1$ for $|x| < \pi/2$.

(a) Find its energy $\int |F(x)|^2\,dx$ by direct integration
(b) Compute its Fourier coefficients $c_k$ as specific numbers
(c) Find the sum in the energy identity (Problem 16).

18 $F(x) = 1 + (\cos x)/2 + \cdots + (\cos nx)/2^n + \cdots$ is analytic: infinitely smooth.

(a) If you take 10 derivatives, what is the Fourier series of $d^{10}F/dx^{10}$?

(b) Does that series still converge quickly? Compare $n^{10}$ with $2^n$ for $n = 2^{10}$.

19 If $f(x) = 1$ for $|x| < \pi/2$ and $f(x) = 0$ for $\pi/2 < |x| < \pi$, find its cosine
coefficients. Can you graph and compute the Gibbs overshoot at the jumps?

20 Find all the coefficients $a_k$ and $b_k$ for $F$, $I$, and $D$ on the interval $-\pi < x < \pi$:

$F(x) = \delta\left( x - \frac{\pi}{2} \right)$     $I(x) = \int_0^x \delta\left( x - \frac{\pi}{2} \right) dx$     $D(x) = \frac{d}{dx}\,\delta\left( x - \frac{\pi}{2} \right)$.

21 For the one-sided tall box function in Example 4, with $F = 1/h$ for $0 < x < h$,
what is its odd part $\frac{1}{2}(F(x) - F(-x))$? I am surprised that the Fourier coefficients
of this odd part disappear as $h$ approaches zero and $F(x)$ approaches $\delta(x)$.

22 Find the series $F(x) = \sum c_k e^{ikx}$ for $F(x) = e^x$ on $-\pi < x < \pi$. That function
$e^x$ looks smooth, but there must be a hidden jump to get coefficients $c_k$ proportional
to $1/k$. Where is the jump?

23 (a) (Old particular solution) Solve $Ay'' + By' + Cy = e^{ikx}$.

(b) (New particular solution) Solve $Ay'' + By' + Cy = \sum c_k e^{ikx}$.
8.2 The Fast Fourier Transform


Fourier series apply to functions. But we compute with vectors. We need to replace the
infinite sequence of coefficients $c_k$ (or $a_k$ and $b_k$) by a finite sequence $c_0, c_1, \ldots, c_{N-1}$.
We want to preserve and use orthogonality, so the computations will be fast. For the
Discrete Fourier Transform, you will see how the FFT makes the computations extra fast.


This section describes two separate ideas. The DFT provides formulas for the c’s.
The FFT is an amazing algorithm to compute the c’s by rearranging those formulas.


Discrete Fourier Transform (DFT) 


The DFT chooses $N$ orthogonal basis vectors $e_0$ to $e_{N-1}$ for $N$-dimensional space.
The vector $e_k$ comes from $e^{ikx}$, by sampling that function at $N$ points spaced by $2\pi/N$:

Basis vector $e_k$ (discrete $e^{ikx}$)     $e_k = \left( e^{ik0}, e^{ik2\pi/N}, e^{ik4\pi/N}, \ldots \right) = \left( 1, w^k, w^{2k}, \ldots \right)$ with $w = e^{2\pi i/N}$

The continuous Fourier series is $\sum c_k e^{ikx}$. The discrete Fourier series is $\sum c_k e_k$. That
sum is a multiplication $f = Fc$ with the symmetric $N$ by $N$ Fourier matrix $F$. The basis
vectors $e_k$ go into the columns of $F$.

The matrix $F$ containing powers of $w$ is shown in detail in equation (4).

Fourier matrix $F$     $f = c_0 e_0 + c_1 e_1 + \cdots = \begin{bmatrix} e_0 & \cdots & e_{N-1} \end{bmatrix} \begin{bmatrix} c_0 \\ \vdots \\ c_{N-1} \end{bmatrix} = Fc$.     (1)


Inverting $f = Fc$ gives $c = F^{-1}f$. The continuous case produced $e^{-ikx}$ in the Fourier
coefficient formula $c_k = \int e^{-ikx} f(x)\,dx / 2\pi$. The discrete case produces powers of
$\bar{w} = e^{-2\pi i/N}$ in the inverse matrix. Those powers of $\bar{w}$ are displayed in equation (3).

Inverse matrix $F^{-1}$     $c = \frac{1}{N}\begin{bmatrix} \bar{e}_0^T \\ \vdots \\ \bar{e}_{N-1}^T \end{bmatrix} \begin{bmatrix} f_0 \\ \vdots \\ f_{N-1} \end{bmatrix} = \frac{1}{N}\bar{F}^T f$.     (2)

The constant vector $e_0 = (1, 1, \ldots, 1)$ has $\|e_0\|^2 = 1 + 1 + \cdots + 1 = N$. Every basis vector
has $\|e_k\|^2 = N$ instead of $\int |e^{ikx}|^2\,dx = 2\pi$.

Please notice that $F^{-1}$ produces the coefficients $c_k$ from the vector $f$: the Fourier
transform. The Fourier matrix $F$ reconstructs $f$ from the c’s (the inverse transform).
The entries of $F^{-1}$ are like $e^{-ikx}$ and the entries of $F$ are like $e^{ikx}$. Thus $F^{-1} = \bar{F}/N$
contains powers of $\bar{w} = e^{-2\pi i/N}$, while $F$ contains powers of $w = e^{2\pi i/N}$.

The MATLAB command c = fft(f) uses $\bar{w}$ and the inverse Fourier matrix $F^{-1}$.
The opposite command f = ifft(c) adds up the $N$-term series $Fc$ to reconstruct $f$ in (1).
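
The two matrix formulas (1) and (2) can be written out directly. The sketch below is a plain $N^2$ illustration, not the FFT: the names `fourier_matrix`, `dft`, and `idft` are hypothetical helpers chosen here, and the division by $N$ follows the convention $c = \bar{F}f/N$ used in this section.

```python
import cmath

def fourier_matrix(N):
    """F[j][k] = w**(j*k) with w = exp(2*pi*i/N); column k is the basis vector e_k."""
    w = cmath.exp(2j * cmath.pi / N)
    return [[w ** (j * k) for k in range(N)] for j in range(N)]

def matvec(A, v):
    """Ordinary matrix-vector product for lists of lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dft(f):
    """c = F^{-1} f = conj(F) f / N, as in formula (2). F is symmetric, so no transpose is needed."""
    N = len(f)
    F_bar = [[z.conjugate() for z in row] for row in fourier_matrix(N)]
    return [z / N for z in matvec(F_bar, f)]

def idft(c):
    """f = F c, as in formula (1)."""
    return matvec(fourier_matrix(len(c)), c)

f = [2.0, -1.0, 0.5, 3.0, 0.0, 1.0]
c = dft(f)
print([round(abs(x - y), 12) for x, y in zip(idft(c), f)])   # round-trip errors
```

Library routines often scale differently: NumPy's `numpy.fft.fft`, for example, leaves out the factor $1/N$ in the forward direction, so its output may need dividing by $N$ to match the $c$ defined here.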
Example 1 The delta vector $f = (1, 0, 0, \ldots)$ is like a delta function $\delta(x)$. The Fourier
coefficients of a delta function are all equal to $c_k = 1/2\pi$. The discrete coefficients of
a delta vector are all equal to $c_k = 1/N$. The transform of $f$ is a constant vector.

Fourier transform $F^{-1}f = c$     $\frac{1}{N}\begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & \bar{w} & \cdots & \bar{w}^{N-1} \\ \vdots & \vdots & & \vdots \\ 1 & \bar{w}^{N-1} & \cdots & \bar{w}^{(N-1)^2} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \frac{1}{N}\begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}$     (3)

Example 2 The shifted vector $f = (0, 1, 0, \ldots)$ is like a shifted delta function $\delta\left( x - \frac{2\pi}{N} \right)$.

The shifted vector $f$ picks out the next column $(1, \bar{w}, \bar{w}^2, \ldots)$ of $F^{-1}$ in equation (3).
The shifted delta function chooses the (same) values of $c_k = e^{-ikx}$ at $x = 2\pi/N$.
The only difference between those discrete and continuous c’s is dividing by $N$ or $2\pi$.

Example 3 The constant vector $c = (1, 1, \ldots)/N$ transforms back to the delta vector!

Fourier matrix $Fc = f$     $\frac{1}{N}\begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & w & \cdots & w^{N-1} \\ \vdots & \vdots & & \vdots \\ 1 & w^{N-1} & \cdots & w^{(N-1)^2} \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$     (4)

That equation says that $N - 1$ basis vectors starting with $(1, w, w^2, \ldots)$ are orthogonal to the
first vector $(1, 1, \ldots, 1)$. The basis vectors $e_k$ in the columns of $F$ are orthogonal.
After a few words about the FFT, equation (7) will confirm this orthogonality.


Fast Fourier Transform (FFT) 


The FFT is a brilliant rearrangement of those matrix-vector multiplications $f = Fc$ and
$c = F^{-1}f$. Normally, multiplying a vector by an $N$ by $N$ matrix takes $N^2$ separate
multiplications. (Each entry in the square matrix is used once. There are $N^2$ entries.)
The FFT computes $c$ and $f$ with only $\frac{1}{2}N\log_2 N$ separate multiplications.

For size $N = 1024 = 2^{10}$, the logarithm is 10. In this case $N^2$ (a million steps) are
reduced to $5N$ (five thousand steps). The transform is sped up by a factor near 200,
which is truly astonishing.

In my opinion, the FFT is the most important algorithm in computational science. 
It has transformed whole industries. When your instruments measure the response to 
an input (like the pressure in an oil well), the DFT shows the response to each frequency. 
The FFT computes N numbers from N numbers, very fast. 


The Basis Vectors $e_k$ in the Fourier Matrix $F$


A crucial point is that the basis vectors $e_0, \ldots, e_{N-1}$ are orthogonal. Those vectors are
complex, just as the functions $e^{ikx}$ are complex. So their inner products $\bar{e}_k^T e_n$ require the
complex conjugate of one vector, just like $\int e^{inx} e^{-ikx}\,dx$.

Here is a typical basis vector $e_k$, followed by the Fourier matrix that contains
$e_0, e_1, \ldots, e_{N-1}$ in its columns:

$e_k = \begin{bmatrix} 1 \\ e^{2\pi ik/N} \\ e^{4\pi ik/N} \\ \vdots \end{bmatrix} = \begin{bmatrix} 1 \\ w^k \\ w^{2k} \\ \vdots \end{bmatrix}$     $F = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & w & w^2 & \cdots & w^{N-1} \\ 1 & w^2 & w^4 & \cdots & w^{2(N-1)} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & w^{N-1} & w^{2(N-1)} & \cdots & w^{(N-1)^2} \end{bmatrix}$     (5)

The number $w$ is $e^{2\pi i/N}$. We use the Greek letter $\omega$ for its conjugate $\omega = e^{-2\pi i/N} = \bar{w}$.

It is the properties of $1, w, w^2, \ldots$ that make the basis vectors (columns of $F$) orthogonal.
Our first step is to locate $w$ and $\bar{w}$ in the complex plane. In fact we can locate all the
powers of $w$ up to $w^N = (e^{2\pi i/N})^N = e^{2\pi i} = 1$. For $N = 8$, the powers of $w$ produce
8 points evenly spaced around the unit circle. Notice that $w^8 = 1$.

For $N = 4$, the four powers will be $i$, $i^2 = -1$, $i^3 = -i$, and $i^4 = 1$.

Figure 8.5: The eight powers of $w = \cos\frac{2\pi}{8} + i\sin\frac{2\pi}{8}$. The polar form $w = e^{2\pi i/8}$ is best.


Orthogonality of the Discrete Fourier Basis 


The key to good formulas for the Fourier coefficients $c_k$ is orthogonality. That property
removes every term except term $k$, when we take a dot product with the basis vector $e_k$:

$f = c_0 e_0 + \cdots + c_{N-1} e_{N-1}$ and $\bar{e}_k^T f = c_k\,\bar{e}_k^T e_k = N c_k$.     (6)

Since $e_0 = (1, 1, 1, \ldots)$ and $e_1 = (1, w, w^2, \ldots)$, the crucial step is their zero dot product:
$1 + w + w^2 + \cdots = 0$. The eight numbers around the circle in Figure 8.5 add to zero.
Here is the statement and proof that every pair of e’s is orthogonal:

If $z^N = 1$ and $z \neq 1$, then the sum $S = 1 + z + z^2 + \cdots + z^{N-1}$ is zero.     (7)

Proof. Multiply $S$ times $z$. This gives $Sz = z + z^2 + z^3 + \cdots + z^N$. Since $z^N = 1$,
$S$ times $z$ has all the same terms as the original sum $S$. Then $Sz = S$, which is $S(z - 1) = 0$.
Since $z \neq 1$, therefore $S = 0$.
Every dot product $\bar{e}_k^T e_n$ is exactly our sum $S$. The number $z$ is $\bar{w}^k w^n$.

$\bar{e}_k^T e_n = 1 + \bar{w}^k w^n + \bar{w}^{2k} w^{2n} + \cdots = 1 + z + z^2 + \cdots + z^{N-1} = S = 0$.     (8)

The $N$th power of $z = \bar{w}^k w^n$ is $z^N = (\bar{w}^N)^k (w^N)^n = (1)(1)$. When $k \neq n$ this $z$ is not 1. Therefore $S = 0$.

Conclusion When we multiply $\bar{F}^T$ times $F$, the diagonal entries are $\bar{e}_k^T e_k = N$ (because
this is a sum of $N$ ones). Off the diagonal we have $k \neq n$ and $\bar{e}_k^T e_n = 0$. Therefore
$\bar{F}^T F = NI$. This confirms that the inverse of the Fourier matrix is $F^{-1} = \frac{1}{N}\bar{F}^T$.

Note 1. Your eye sees right away that the 8 numbers around the circle add to zero.
Each number cancels its opposite number: $1 + w^4$ is zero, $w + w^5$ is zero, $w^2 + w^6$
is zero, $w^3 + w^7$ is zero. But this proof won’t work for $N = 7$ or 5 or 3. We can’t pair off
the points when $N$ is odd. They still add to zero by equation (8).
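
The odd case can be checked numerically. The sketch below is an illustration with hypothetical helper names: it forms all inner products $\bar{e}_k^T e_n$ for $N = 7$, where the pairing argument of Note 1 is not available, and confirms that the result behaves like $NI$.

```python
import cmath

def basis_vector(k, N):
    """e_k = (1, w^k, w^(2k), ...) with w = exp(2*pi*i/N)."""
    w = cmath.exp(2j * cmath.pi / N)
    return [w ** (j * k) for j in range(N)]

def inner(u, v):
    """Complex inner product: conjugate the first vector, then sum the products."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def gram_matrix(N):
    """All inner products of the basis vectors; expected to be N times the identity."""
    vectors = [basis_vector(k, N) for k in range(N)]
    return [[inner(u, v) for v in vectors] for u in vectors]

G = gram_matrix(7)
worst_off_diagonal = max(abs(G[k][n]) for k in range(7) for n in range(7) if k != n)
print("diagonal entry:", G[0][0], "largest off-diagonal size:", worst_off_diagonal)
```

The off-diagonal entries are zero only up to floating-point roundoff, so the check uses a small tolerance instead of exact equality.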


Note 2. A cool proof of orthogonality is to see the vectors $e_0, \ldots, e_{N-1}$ as eigenvectors
of a symmetric matrix. Every symmetric matrix has orthogonal eigenvectors. Problem 14
will choose a suitable matrix (it is a circulant matrix) and pursue this idea.


Here are the components of f = Fc and c = F~'f : Discrete Fourier Transform 
: fe 1 = 
fxg=epe= >) wike, Ce = eet = DO Si (9) 


The symmetry of transform and inverse transform is beautiful. We didn’t see this so 
clearly for Fourier series, where c was a vector but f was a periodic function. The ele- 
gant symmetry reappears when the transform is between function f(x) and function c(k) : 


Fourier Integral Transform   $c(k) = \int_{-\infty}^{\infty} f(x)\, e^{-ikx}\, dx$   and   $f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} c(k)\, e^{ikx}\, dk$.   (10)

Everybody notices $e^{-ikx}$ and $e^{ikx}$. Be sure to notice dx and dk. The functions f(x) and
c(k) are defined for $-\infty < x < \infty$ and $-\infty < k < \infty$. The transform connects f(x) in
the space domain to c(k) in the frequency domain. $f(x) = \delta(x)$ transforms to $c(k) = 1$.

Section 8.6 will solve $-y'' + y = f(x)$ (no boundaries!) using this integral transform.
Two more examples of the discrete transform are cos and sin. 


Example 4  Sample cos x and sin x at $x = 0, \pi/2, \pi, 3\pi/2$ to get discrete vectors cos and sin.
Transform those vectors by $F^{-1}$. Invert their transforms by F.

Discrete cosine and sine   cos = (1, 0, -1, 0) and sin = (0, 1, 0, -1).

To transform x-space to k-space, we multiply f by $F^{-1}$. For N = 4, this matrix contains
powers of $\bar{w} = -i$. We remember to divide by N = 4:

$F^{-1} \cos = \frac{1}{4} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -i & -1 & i \\ 1 & -1 & 1 & -1 \\ 1 & i & -1 & -i \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ -1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1/2 \\ 0 \\ 1/2 \end{bmatrix}$   and   $F^{-1} \sin = \begin{bmatrix} 0 \\ -i/2 \\ 0 \\ i/2 \end{bmatrix}$

Multiplication by F transforms back to cos and sin. This is exactly consistent with the
famous formulas of Euler: $\cos x = \frac{1}{2}(e^{ix} + e^{-ix})$ and $\sin x = \frac{1}{2i}(e^{ix} - e^{-ix})$.

Let me also write exp for the samples $(1, w, w^2, w^3)$ of $e^{ix}$ at $x = 0, \pi/2, \pi, 3\pi/2$.
Then we have Euler's great formulas for vectors:

$\exp = \cos + i \sin$   and   $\cos = \frac{1}{2}(\exp + \overline{\exp})$

$\overline{\exp} = \cos - i \sin$   and   $\sin = \frac{1}{2i}(\exp - \overline{\exp})$


One Step of the Fast Fourier Transform 


Multiplication by an N by N matrix takes N? multiplications and additions. Since the 
Fourier matrix has no zero entries, you might think it is impossible to do better. But the 
entries $w^{jk}$ are very special. The FFT idea is to factor F into sparse matrices.

If you prefer to think of the summation formulas $\sum w^{jk} c_k$ and $\sum \bar{w}^{jk} f_j$, each sum
has N terms and a vector needs N sums. In summation language, the FFT idea is to rewrite 
and regroup the sums to have many fewer terms. I will try to use both languages. 


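The key idea is to connect $F_N$ with the half-size Fourier matrix $F_{N/2}$. Assume that
N is a power of 2 (say N = 1024). We will connect $F_{1024}$ to two copies of $F_{512}$. When
N = 4, we connect $F_4$ to two $F_2$'s:

$F_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & i & i^2 & i^3 \\ 1 & i^2 & i^4 & i^6 \\ 1 & i^3 & i^6 & i^9 \end{bmatrix}$   and   $\begin{bmatrix} F_2 & 0 \\ 0 & F_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 & & \\ 1 & i^2 & & \\ & & 1 & 1 \\ & & 1 & i^2 \end{bmatrix}$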


On the left is $F_4$, with no zeros. On the right is a matrix that is half zero. The work is cut
in half. But wait, those matrices are not the same. The block matrix with $F_2$'s is only one
piece of the factorization of $F_4$. The other pieces also have many zeros:

Key idea   $F_4 = \begin{bmatrix} 1 & & 1 & \\ & 1 & & i \\ 1 & & -1 & \\ & 1 & & -i \end{bmatrix} \begin{bmatrix} 1 & 1 & & \\ 1 & i^2 & & \\ & & 1 & 1 \\ & & 1 & i^2 \end{bmatrix} \begin{bmatrix} 1 & & & \\ & & 1 & \\ & 1 & & \\ & & & 1 \end{bmatrix}$.   (11)

The permutation matrix on the right puts $c_0$ and $c_2$ (evens) ahead of $c_1$ and $c_3$ (odds).
The middle matrix performs separate half-size transforms on those evens and odds.
The matrix at the left combines the two half-size outputs, and it produces the correct
full-size output $f = F_4 c$. You could multiply those three matrices to see $F_4$.

The same idea applies when N = 1024 and $M = \frac{1}{2}N = 512$. The number w is
$e^{2\pi i/1024}$. It is at the angle $\theta = 2\pi/1024$ on the unit circle. The Fourier matrix $F_{1024}$
is full of powers of w. The first stage of the FFT is the great factorization discovered by
Cooley and Tukey (and foreshadowed in 1805 by Gauss):

$F_{1024} = \begin{bmatrix} I_{512} & D_{512} \\ I_{512} & -D_{512} \end{bmatrix} \begin{bmatrix} F_{512} & \\ & F_{512} \end{bmatrix} \begin{bmatrix} \text{even-odd} \\ \text{permutation} \end{bmatrix}$   (12)

$I_{512}$ is the identity matrix. $D_{512}$ is the diagonal matrix with entries $(1, w, \ldots, w^{511})$ using
$w_{1024}$. The two copies of $F_{512}$ are what we expected. They use the 512th root of unity,
which is nothing but $w_{512} = (w_{1024})^2$. The even-odd permutation matrix separates the
incoming vector c into $c' = (c_0, c_2, \ldots, c_{1022})$ and $c'' = (c_1, c_3, \ldots, c_{1023})$.

Here are the algebra formulas which express this neat FFT factorization of $F_N$:


(FFT)  Set $M = \frac{1}{2}N$. The components of $f = F_N c$ are combinations of the half-size
transforms $f' = F_M c'$ and $f'' = F_M c''$. Equation (13) shows $If' + Df''$ and
$If' - Df''$ with numbers $(w_N)^j$ on the main diagonal of D:

First half    $f_j = f'_j + (w_N)^j f''_j$,   $j = 0, \ldots, M-1$
                                                                        (13)
Second half   $f_{j+M} = f'_j - (w_N)^j f''_j$,   $j = 0, \ldots, M-1$

Thus each FFT step has three parts: split c into c' and c'', transform them separately by
$F_M$ into f' and f'', and reconstruct f from equation (13). N must be even!
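The three parts of one FFT step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a tuned implementation: the names dft and fft_step are hypothetical, the half-size transforms are done by the slow direct sum so that only one level of splitting is shown, and the sign convention is $w = e^{2\pi i/N}$ as in this section.

```python
import cmath

def dft(c):
    """Direct N^2 transform f = F c, used here for the two half-size pieces."""
    n = len(c)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(w ** (j * k) * c[k] for k in range(n)) for j in range(n)]

def fft_step(c):
    """One step of equation (13). The length of c must be even."""
    n = len(c)
    m = n // 2
    w = cmath.exp(2j * cmath.pi / n)
    f_even = dft(c[0::2])   # f'  = F_M c'   (even-numbered c's)
    f_odd = dft(c[1::2])    # f'' = F_M c''  (odd-numbered c's)
    first_half = [f_even[j] + w ** j * f_odd[j] for j in range(m)]
    second_half = [f_even[j] - w ** j * f_odd[j] for j in range(m)]
    return first_half + second_half

c = [1, 2, 3, 4, 0, -1, 2, 5]
one_step = fft_step(c)   # should agree with dft(c) up to roundoff
```

Comparing fft_step(c) with dft(c) for N = 4 or N = 8 is a quick way to confirm the signs in (13).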


The algebra of (13) is a splitting into even numbers 2k and odd 2k + 1, with $w = w_N$:

Even/Odd   $f_j = \sum_{k=0}^{N-1} w^{jk} c_k = \sum_{k=0}^{M-1} w^{2jk} c_{2k} + \sum_{k=0}^{M-1} w^{j(2k+1)} c_{2k+1}$   with $M = \frac{1}{2}N$.   (14)

The even c's go into $c' = (c_0, c_2, \ldots)$ and the odd c's go into $c'' = (c_1, c_3, \ldots)$. Then
come the transforms $F_M c'$ and $F_M c''$. The key is $w_N^2 = w_M$. This gives $w_N^{2jk} = w_M^{jk}$.

Rewrite   $f_j = \sum_{k=0}^{M-1} w_M^{jk} c'_k + (w_N)^j \sum_{k=0}^{M-1} w_M^{jk} c''_k = f'_j + (w_N)^j f''_j$.   (15)

For $j \geq M$, the minus sign in (13) comes from factoring out $(w_N)^M = -1$.


MATLAB easily separates even c's from odd c's. Then two half-size inverse transforms
use ifft. The last step produces f from the half-size f' and f''.
Problem 2 shows that F and $F^{-1}$ have the same rows, in different orders.

FFT step from N to N/2 in MATLAB

% Indices count from 0 as in the text (MATLAB itself numbers entries from 1).
f'  = ifft (c(0 : 2 : N-2)) * N/2;      % evens
f'' = ifft (c(1 : 2 : N-1)) * N/2;      % odds
D   = w.^(0 : N/2-1)';                  % diagonal of matrix
f   = [f' + D.*f''; f' - D.*f''];

The flow graph shows c' and c'' going through the half-size $F_2$. Those steps are called
"butterflies," from their shape. Then the outputs f' and f'' are combined (multiplying f''
by 1, i and also by -1, -i) to produce $f = F_4 c$. The indices 0, 1, 2, 3 are in binary.

[Flow graph from c to f, N = 4 and M = 2: inputs c in the order 00, 10, 01, 11; outputs f in the order 00, 01, 10, 11.]

Figure 8.6: Flow graph from c to f for the Fast Fourier Transform with N = 4.


This reduction from $F_N$ to two $F_M$'s almost cuts the work in half: you see the zeros in
the matrix factorization (12). That reduction is good but not great. The full idea of the FFT 
is much more powerful. It saves much more time than 50%. 


The Full FFT by Recursion 


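If you have read this far, you may have guessed what comes next. We reduced $F_N$ to
$F_{N/2}$. Keep going to $F_{N/4}$. The two copies of $F_{512}$ lead to four copies of $F_{256}$. Then
256 leads to 128. That is recursion. It is a basic principle of many fast algorithms.

Here is the second stage with $F = F_{256}$ and $D = \mathrm{diag}(1, w_{512}, \ldots, (w_{512})^{255})$:

$\begin{bmatrix} F_{512} & 0 \\ 0 & F_{512} \end{bmatrix} = \begin{bmatrix} I & D & & \\ I & -D & & \\ & & I & D \\ & & I & -D \end{bmatrix} \begin{bmatrix} F & & & \\ & F & & \\ & & F & \\ & & & F \end{bmatrix} \begin{bmatrix} \text{pick } 0, 4, 8, \ldots \\ \text{pick } 2, 6, 10, \ldots \\ \text{pick } 1, 5, 9, \ldots \\ \text{pick } 3, 7, 11, \ldots \end{bmatrix}$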


Before the FFT was invented, the operation count was $N^2 = (1024)^2$. This is about a
million multiplications. I am not saying that they take a long time. The cost becomes large
when we have many transforms to do, which is typical. Then the saving is also large:

The final count for size $N = 2^L$ is reduced from $N^2$ to $\frac{1}{2} N L$.

Here is the reasoning behind $\frac{1}{2} N L$. There are L levels, going from $N = 2^L$ down to
N = 1. Each level has $\frac{1}{2} N$ multiplications from diagonal matrices D, to reassemble the
half-size outputs. This yields the final count $\frac{1}{2} N L$, which is $\frac{1}{2} N \log_2 N$.
For N = 1024 and L = 10 that is 5120 multiplications instead of about a million.
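The recursion and its operation count can be sketched together. The Python code below is an illustration only: fft is a hypothetical name, the counter tallies just the multiplications by diagonal entries of D (one per pair of outputs at each level), and the sign convention is $w = e^{2\pi i/N}$ as in this section. It assumes the length is a power of 2.

```python
import cmath

def fft(c):
    """Return (f, mults): f = F c and the number of multiplications by entries of D."""
    n = len(c)
    if n == 1:
        return list(c), 0
    f_even, m_even = fft(c[0::2])    # half-size transform of the even c's
    f_odd, m_odd = fft(c[1::2])      # half-size transform of the odd c's
    w = cmath.exp(2j * cmath.pi / n)
    half = n // 2
    twisted = [w ** j * f_odd[j] for j in range(half)]        # D f'' costs N/2 multiplications
    f = [f_even[j] + twisted[j] for j in range(half)]         # first half of (13)
    f += [f_even[j] - twisted[j] for j in range(half)]        # second half of (13)
    return f, m_even + m_odd + half

small, small_count = fft([1, 0, 1, 0])      # N = 4, L = 2
big, big_count = fft(list(range(1024)))     # N = 1024, L = 10
```

The counter obeys T(N) = 2T(N/2) + N/2 with T(1) = 0, which is the recurrence behind the count of one half N times L.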


Exactly the same idea gives a fast inverse transform. The matrix $F_N^{-1}$ contains powers
of the conjugate $\bar{w}$. We just replace w by $\bar{w}$ in the diagonal matrix D, and in formula (13).
The fastest FFT will be adapted to the processor and cache capacity of each computer.
For free software that automatically adjusts, we highly recommend the website fftw.org.
This gives the "fastest Fourier transform in the west."

REVIEW OF THE KEY IDEAS

1. Multiplying coefficients c by the Fourier matrix F adds the series $f_j = \sum w^{jk} c_k$.

2. The inverse matrix $F^{-1} = \bar{F}/N$ computes the coefficients $c_k = \sum \bar{w}^{jk} f_j / N$.

3. The FFT splits those sums in half: $\frac{1}{2}N$ terms with powers of $w^2$. Then recombine.

4. By recursion the FFT has $\log_2 N$ steps with diagonal matrices: $N \log_2 N$ operations.

5. The columns $e_k = (1, w^k, w^{2k}, \ldots)$ are orthogonal, when $w = e^{2\pi i/N}$ and $w^N = 1$.


Problem Set 8.2

1  Multiply the three matrices in equation (11) and compare with F. In which six
entries do you need to know that $i^2 = -1$? This is $(w_4)^2 = w_2$. If M = N/2,
why is $(w_N)^M = -1$?

2  Why is row i of $\bar{F}$ the same as row N - i of F (numbered from 0 to N - 1)?

3  From Problem 2, find the 4 by 4 permutation matrix P so that $F = P\bar{F}$. Check that
$P^2 = I$ so that $P = P^{-1}$. Then from $\bar{F}F = 4I$ show that $F^2 = 4P$.

It is amazing that $F^4 = 16P^2 = 16I$. Four transforms of any c bring back 16c.
For all N, $F^2/N$ is a permutation matrix P and $F^4 = N^2 I$.

4  Invert the three factors in equation (11) to find a fast factorization of $F^{-1}$.

5  F is symmetric. Transpose equation (11) to find a new Fast Fourier Transform.

6  All entries in the factorization of $F_6$ involve powers of w = sixth root of 1:

$F_6 = \begin{bmatrix} I & D \\ I & -D \end{bmatrix} \begin{bmatrix} F_3 & \\ & F_3 \end{bmatrix} \begin{bmatrix} P \end{bmatrix}$

Write down these factors with $1, w, w^2$ in D and powers of $w^2$ in $F_3$. Multiply!

7  Put the vector c = (1, 0, 1, 0) through the three steps of the FFT to find y = Fc. Do
the same for c = (0, 1, 0, 1).

8  Compute $y = F_8 c$ by the three FFT steps for c = (1, 0, 1, 0, 1, 0, 1, 0). Repeat the
computation for c = (0, 1, 0, 1, 0, 1, 0, 1).

9  If $w = e^{2\pi i/64}$ then $w^2$ and $\sqrt{w}$ are among the ____ and ____ roots of 1.

10  F is a symmetric matrix. Its eigenvalues aren't real. How is this possible?

The three great symmetric tridiagonal matrices of applied mathematics are K, B, C.
The eigenvectors of K, B, and C are discrete sines, cosines, and exponentials. The eigenvector
matrices give the DST, DCT, and DFT: discrete transforms for signal processing.
Notice that diagonals of the circulant matrix C loop around to the far corners.

$K = \begin{bmatrix} 2 & -1 & & \\ -1 & 2 & -1 & \\ & \cdot & \cdot & \cdot \\ & & -1 & 2 \end{bmatrix}$   $B = \begin{bmatrix} 1 & -1 & & \\ -1 & 2 & -1 & \\ & \cdot & \cdot & \cdot \\ & & -1 & 1 \end{bmatrix}$   $C = \begin{bmatrix} 2 & -1 & & -1 \\ -1 & 2 & -1 & \\ & \cdot & \cdot & \cdot \\ -1 & & -1 & 2 \end{bmatrix}$

$K_{11} = K_{NN} = 2$     $B_{11} = B_{NN} = 1$     $C_{1N} = C_{N1} = -1$

The eigenvectors of $K_N$ and $B_N$ are the discrete sines $s_1, \ldots, s_N$ and the discrete
cosines $c_0, \ldots, c_{N-1}$. Notice the eigenvector $c_0 = (1, 1, \ldots, 1)$. Here are $s_k$ and
$c_k$: these vectors are samples of sin kx and cos kx from 0 to $\pi$.

$s_k = \left( \sin\frac{\pi k}{N+1}, \sin\frac{2\pi k}{N+1}, \ldots, \sin\frac{N\pi k}{N+1} \right)$   and   $c_k = \left( \cos\frac{\pi k}{2N}, \cos\frac{3\pi k}{2N}, \ldots, \cos\frac{(2N-1)\pi k}{2N} \right)$

11  For 2 by 2 matrices $K_2$ and $B_2$, verify that $s_1, s_2$ and $c_0, c_1$ are eigenvectors.

12  Show that $C_3$ has eigenvalues $\lambda = 0, 3, 3$ with eigenvectors $e_0 = (1, 1, 1)$,
$e_1 = (1, w, w^2)$, $e_2 = (1, w^2, w^4)$. You may prefer the real eigenvectors (1, 1, 1)
and (1, 0, -1) and (1, -2, 1).

13  Multiply to see the eigenvectors $e_k$ and eigenvalues $\lambda_k$ of $C_N$. Simplify to
$\lambda_k = 2 - 2\cos(2\pi k/N)$. Explain why $C_N$ is only semidefinite. It is not positive definite.


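$C e_k = \begin{bmatrix} 2 & -1 & & -1 \\ -1 & 2 & -1 & \\ & \cdot & \cdot & \cdot \\ -1 & & -1 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ w^k \\ \vdots \\ w^{(N-1)k} \end{bmatrix} = \lambda_k \begin{bmatrix} 1 \\ w^k \\ \vdots \\ w^{(N-1)k} \end{bmatrix}$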


14  The eigenvectors $e_k$ of C are automatically perpendicular because C is a ____
matrix. (To tell the truth, C has repeated eigenvalues as in Problem 12. There was
a plane of eigenvectors for $\lambda = 3$ and we chose orthogonal $e_1$ and $e_2$ in that plane.)

15  Write the 2 eigenvalues for $K_2$ and the 3 eigenvalues for $B_3$. Always $K_N$ and $B_{N+1}$
have the same N eigenvalues, with the extra eigenvalue ____ for $B_{N+1}$. (This is
because $K = A^T A$ and $B = A A^T$.)

8.3 The Heat Equation

The first partial differential equation in this book was $u_{xx} + u_{yy} = 0$ (Laplace's equation).
This describes a steady state: time is not involved. There is no growth or oscillation or
decay. The problem includes boundary conditions on u(x, y), but not initial conditions.
This is like a matrix equation Au = b (where b comes from boundary conditions).

Now we move to the heat equation $u_t = u_{xx}$. Time is very much involved. We think
of u as the temperature along a bar at time t. We are given the initial temperature u(0, x)
at time t = 0 and at each position x. Then heat begins to flow (from positions with higher
temperature to neighbors at lower temperature). This is like a matrix equation u' = Au
with an initial condition u(0). Au is now the second derivative $u_{xx}$.

We have a PDE and not an ODE, a partial and not an ordinary differential equation,
because the temperature u is a function of both x and t.


Example 1  (Infinite bar) Suppose the bar goes from $x = -\infty$ to $x = \infty$. At time
t = 0, the temperature is u = -1 on the left side x < 0 and u = 1 on the right side x > 0.
Heat will flow from the right side to the left side. The temperature along the left half
will go up from u = -1. The right half will go down from u = 1. Solved in Example 6.

Example 2  (Finite bar) Suppose the bar goes from x = 0 to x = 1. The initial
condition u(0, x) = 1 tells us the (constant) temperature along the bar at time t = 0. We
also need boundary conditions like u(t, 0) = 0 and u(t, 1) = 0 at the ends of the bar.
Then the ends stay at zero temperature for all time t > 0.

Heat will flow out the ends. Imagine a bar in a freezer, with the sides coated. Heat
escapes only at x = 0 and x = 1. We solve the heat equation to find the temperature
u(t, x) at every position 0 < x < 1 and every time t > 0.


Heat equation   $\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}$   with u(0, x) = 1 and u(t, 0) = u(t, 1) = 0.   (1)

A good form for the solution is a Fourier series. It is natural to choose a sine series, since
every basis function $\sin k\pi x$ is zero at x = 0 and x = 1, exactly what the boundary
conditions require: zero temperature at the ends of the bar.

The initial value u(0, x) and the differential equation $u_t = u_{xx}$ will have to tell us the
coefficients $b_1(t), b_2(t), \ldots$ in the Fourier sine series. Heat escapes and $b_k(t) \to 0$.


Solution plan  The equation $u_t = u_{xx}$ looks different from du/dt = Au, but it's not.
The solution still combines the eigenvectors. The pieces for the ODE were $c\, e^{\lambda t} x$. The pieces
for the PDE are $b\, e^{\lambda t} \sin k\pi x$.

1. Eigenvectors of A change to eigenfunctions of the second derivative: $(\sin k\pi x)'' = -k^2\pi^2 \sin k\pi x$.

2. $u(0) = c_1 x_1 + c_2 x_2 + \cdots$ changes to $u(0, x) = b_1 \sin \pi x + b_2 \sin 2\pi x + \cdots$ (with
infinitely many b's).

3. The solution (7) adds up $b_k e^{\lambda_k t} \sin k\pi x$. It is an infinite Fourier series.

Infinity could make the problem difficult, but the $\sin k\pi x$ are orthogonal. Problem solved.

Solution by Fourier Series

Everything comes from choosing the right form for the solution u(t, x). Here it is:

Sine series   $u(t, x) = b_1(t) \sin \pi x + b_2(t) \sin 2\pi x + \cdots = \sum_{k=1}^{\infty} b_k(t) \sin k\pi x$.   (2)

This form shows separation of variables. Functions $b_k(t)$ depending on t multiply
functions $\sin k\pi x$ depending on x. When we substitute that product $b_k(t) \sin k\pi x$
into the heat equation, we get a differential equation for each of the coefficients $b_k$:

$\frac{\partial}{\partial t}(b_k \sin k\pi x) = \frac{\partial^2}{\partial x^2}(b_k \sin k\pi x)$   gives   $\frac{db_k}{dt} \sin k\pi x = -k^2\pi^2 b_k \sin k\pi x$.   (3)

Then $b_k' = -k^2\pi^2 b_k$. Solving this equation will produce every $b_k(t)$ from $b_k(0)$:

Decay comes from $e^{\lambda t}$   $b_k(t) = e^{-k^2\pi^2 t}\, b_k(0)$.   (4)


Final step: The starting values $b_k(0)$ are decided by the initial condition u(0, x) = 1:

At t = 0   $u(0, x) = \sum_{k=1}^{\infty} b_k(0) \sin k\pi x = 1$   for 0 < x < 1.   (5)

This is an ordinary Fourier series question: What are the coefficients of a square wave
SW(x)? Sines are odd functions, sin(-x) = -sin x. The series in (5) must add to -1
for x between -1 and 0. So the square wave jumps from -1 to 1. It is negative on half of
the interval and positive on the other half:

$SW(x) = \begin{cases} -1 & \text{for } -1 < x < 0 \\ 1 & \text{for } 0 < x < 1 \end{cases} = \frac{4}{\pi} \left( \frac{\sin \pi x}{1} + \frac{\sin 3\pi x}{3} + \frac{\sin 5\pi x}{5} + \cdots \right)$.   (6)

The even coefficients $b_2, b_4, \ldots$ are all zero. The odd coefficients are $b_k = 4/\pi k$. Those
b's were computed in Section 8.1, as the first example of a Fourier series. Now these
numbers are giving the coefficients $b_k(0)$ at t = 0. Then the equation $b_k' = -k^2\pi^2 b_k$
tells us the coefficients $e^{-k^2\pi^2 t}\, b_k(0)$ at all future times t > 0:

Solution   $u(t, x) = \sum_{k=1}^{\infty} e^{-k^2\pi^2 t}\, b_k(0) \sin k\pi x = \frac{4}{\pi} \left( e^{-\pi^2 t} \sin \pi x + \frac{1}{3} e^{-9\pi^2 t} \sin 3\pi x + \cdots \right)$.   (7)

This completes the solution of the heat equation. The heat drops off quickly! Those are
powerful exponentials $e^{-\pi^2 t}$ and $e^{-9\pi^2 t}$. The bar will feel extremely cold when t = 1.
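To see how fast the exponentials act, the series (7) can be summed numerically. The Python sketch below is an illustration only: bar_temperature is a hypothetical name, and the cutoff of 200 odd terms is an arbitrary choice that is far more than needed once t > 0.

```python
import math

def bar_temperature(t, x, terms=200):
    """Partial sum of (7): (4/pi) * sum over odd k of exp(-k^2 pi^2 t) sin(k pi x) / k."""
    total = 0.0
    for k in range(1, 2 * terms, 2):   # k = 1, 3, 5, ...
        total += math.exp(-(k * math.pi) ** 2 * t) * math.sin(k * math.pi * x) / k
    return 4.0 / math.pi * total

early_middle = bar_temperature(0.01, 0.5)   # still close to the starting value 1
late_middle = bar_temperature(1.0, 0.5)     # essentially only the k = 1 term survives
```

At t = 1 the middle of the bar is at $(4/\pi) e^{-\pi^2}$, roughly 0.000066 of its starting temperature, because every later term is smaller by a factor of $e^{-8\pi^2}$ or more.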


Note  The correct heat equation should be $u_t = c\, u_{xx}$ with a diffusion constant c.
Otherwise the equation is dimensionally wrong. The units of c are $(\text{distance})^2/\text{time}$,
in order to balance $u_t$ with $u_{xx}$. Then c is large for metals (heat flows easily) compared
to its value for water or air. The factor c enters the eigenvalues $-ck^2\pi^2$.

The heat equation is also the diffusion equation. A smokestack is almost a point source
(a delta function). The smoke spreads out (diffuses into the air). This would involve two
space dimensions x and y, or even x, y, z. The PDE could become $u_t = c(u_{xx} + u_{yy})$.

Summary  We had a boundary value problem in x, and an initial value problem in t:

1. The basis functions $S_k = \sin k\pi x$ depend on x. They solve $u_{xx} = \lambda u$.

2. The coefficients $b_k$ depend on t. They solve $b' = \lambda b$ with b(0) coming from u(0).

The basis functions $S_k(x)$ satisfy the boundary conditions.
Their coefficients $b_k(t)$ satisfy the initial conditions:

Separation at t = 0   $u(0, x) = \sum b_k(0)\, S_k(x)$   (8)

The PDE for u(t, x) gives an ODE for each coefficient $b_k(t)$. Here are three more bars.


Example 3  (Insulated bar) No heat escapes from the ends of the bar. The boundary
conditions change to $\partial u/\partial x = 0$ at those ends. The basis functions change to cosines.
The series (8) becomes a Fourier cosine series.

Initial condition    $u(0, x) = \sum a_k(0) \cos k\pi x$

Equation for the $a_k$    $da_k/dt = -k^2\pi^2 a_k$   for k = 0, 1, 2, ...

Notice that k = 0 is included. The first basis function is $\cos 0\pi x = 1$. Its coefficient
is controlled by $da_0/dt = 0$. Thus k = 0 contributes a constant $a_0$ to the solution u(t, x).
The temperature approaches this constant everywhere along the bar, since $a_1, a_2, a_3, \ldots$ all
die out exponentially fast.


Example 4  (Circular bar) Now sines and cosines are both included. The basis functions
can also be complex exponentials $e^{ikx}$. Again u goes to a constant steady state $c_0$:

$u(t, x) = \sum_{k=-\infty}^{\infty} c_k(t)\, e^{ikx}$   and   $\frac{dc_k}{dt} = -k^2 c_k$.   (9)


When you have a separated form for the pieces of u, your problem is nearly solved. 


Example 5  (Infinite bar) This problem leads to something new and important. There are
no boundaries. All exponentials $e^{ikx}$ (not just whole numbers k) are needed. By
combining the solutions for $-\infty < k < \infty$ we can solve the heat equation starting from
a delta function $\delta(x)$. This "heat kernel" is the key to chemical engineering. By a totally
unexpected development it is also central to mathematical finance. The prices of stock
options are modelled by the Black-Scholes partial differential equation.

To solve for each separate $e^{ikx}$, look for the right multiplier $e^{i\omega t}$:

$u = e^{i\omega t} e^{ikx}$ solves $u_t = u_{xx}$ when $i\omega = (ik)^2$.   (10)

Then $i\omega t = (ik)^2 t = -k^2 t$. The solution u(t, x) has a separated form, with these pieces:

$u(t, x) = e^{-k^2 t} e^{ikx}$ solves the heat equation. It starts from $u(0, x) = e^{ikx}$.   (11)

The Heat Kernel U(t, x)

The delta function $\delta(x)$ contains all exponentials $e^{ikx}$ in equal amounts. By superposition,
the solution U to the heat equation starting from $\delta(x)$ will contain the solutions $e^{-k^2 t} e^{ikx}$
in equal amounts. Integrate $e^{-k^2 t} e^{ikx}$ over all k to find the heat kernel U.

The solution with $U(0, x) = \delta(x)$ is   $U(t, x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-k^2 t} e^{ikx}\, dk$.   (12)

Computing this integral is possible, but unexpected. No simple function of k has the
derivative $e^{-k^2 t}$, or close. The neat way is to start with $\partial U/\partial x$. The derivative of $e^{ikx}$
brings the extra factor ik. Then integration by parts connects $\partial U/\partial x$ to U:

$\frac{\partial U}{\partial x} = \frac{1}{2\pi} \int_{-\infty}^{\infty} (e^{-k^2 t} k)(i e^{ikx})\, dk = -\frac{1}{2\pi} \int_{-\infty}^{\infty} \left( \frac{e^{-k^2 t}}{2t} \right)(x e^{ikx})\, dk = -\frac{xU}{2t}$.   (13)

Now dU/U equals -x dx/2t. Integration gives $\ln U = -x^2/4t$ plus a constant, and then $U = c\, e^{-x^2/4t}$.

The total heat $\int U\, dx$ starts at $\int \delta(x)\, dx = 1$. To stay at 1, we choose $c = 1/\sqrt{4\pi t}$.
Then we have the "fundamental solution" for a point source.

Heat kernel   $U_t = U_{xx}$ with $U(0, x) = \delta(x)$   $U(t, x) = \frac{1}{\sqrt{4\pi t}}\, e^{-x^2/4t}$.   (14)
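Two properties of the heat kernel can be checked numerically: the total heat stays at 1, and $U_t - U_{xx}$ is zero. The Python sketch below is an illustration only. The names heat_kernel, total_heat and pde_residual are hypothetical, and the integration width, grid size and difference step are arbitrary choices.

```python
import math

def heat_kernel(t, x):
    """U(t, x) = exp(-x^2 / 4t) / sqrt(4 pi t) from equation (14), for t > 0."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def total_heat(t, half_width=40.0, n=8000):
    """Trapezoidal rule for the integral of U over [-half_width, half_width]."""
    h = 2.0 * half_width / n
    s = 0.5 * (heat_kernel(t, -half_width) + heat_kernel(t, half_width))
    s += sum(heat_kernel(t, -half_width + i * h) for i in range(1, n))
    return s * h

def pde_residual(t, x, d=1e-3):
    """Centered differences for U_t - U_xx. This should be near zero."""
    u_t = (heat_kernel(t + d, x) - heat_kernel(t - d, x)) / (2.0 * d)
    u_xx = (heat_kernel(t, x + d) - 2.0 * heat_kernel(t, x) + heat_kernel(t, x - d)) / (d * d)
    return u_t - u_xx

heat_at_half = total_heat(0.5)
heat_at_two = total_heat(2.0)
residual = pde_residual(1.0, 0.7)
```

The residual is not exactly zero because finite differences carry an error of order $d^2$. It only needs to be small compared with $U_t$ itself.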


Example 6  On an infinite bar, the heat kernel (14) solves $u_t = u_{xx}$ starting from $\delta(x)$
at t = 0. Now solve Example 1, which started from u = -1 for negative x and u = 1 for
positive x. Then solve for any initial function u(0, x).

Here is the key idea for Example 1. The derivative of the jump from -1 to 1 at x = 0
is $du/dx = 2\delta(x)$. The solution starting from $2\delta(x)$ has $du/dx = 2U$, which cancels $\sqrt{4}$
in (14). Then integrate 2U to undo the derivative and solve Example 1 for u:

Error function (integral of 2U)   $u(t, x) = \frac{1}{\sqrt{\pi t}} \int_0^x e^{-X^2/4t}\, dX$.   (15)

For x > 0 this solution is positive. For x < 0 it is negative (the integral in (15) goes
backward). At x = 0 the solution stays at zero, which we expect by symmetry. I wrote
the words "error function" because this important integral has been computed and tabulated
to high accuracy (no simple function has the derivative $e^{-x^2}$). We just change the variable
of integration from X to $Y = X/\sqrt{4t}$, to see the standard error function:

$u = \frac{1}{\sqrt{\pi t}} \int_0^x e^{-X^2/4t}\, dX = \frac{2}{\sqrt{\pi}} \int_0^{x/\sqrt{4t}} e^{-Y^2}\, dY = \mathrm{erf}\left( \frac{x}{\sqrt{4t}} \right)$.   (16)
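The change of variable in (16) can be confirmed numerically, since Python's standard library provides math.erf. The sketch below is an illustration only: step_solution and integral_form are hypothetical names, and the midpoint rule with 2000 cells is an arbitrary choice.

```python
import math

def step_solution(t, x):
    """Closed form (16): u = erf(x / sqrt(4t)) for the start u = -1 (x < 0), u = 1 (x > 0)."""
    return math.erf(x / math.sqrt(4.0 * t))

def integral_form(t, x, n=2000):
    """Midpoint rule for (15): (1/sqrt(pi t)) times the integral of exp(-X^2/4t) from 0 to x."""
    h = x / n   # negative when x < 0, so the integral runs backward
    s = sum(math.exp(-((i + 0.5) * h) ** 2 / (4.0 * t)) for i in range(n))
    return s * h / math.sqrt(math.pi * t)

closed = step_solution(0.3, 1.5)
numeric = integral_form(0.3, 1.5)
left_side = step_solution(0.3, -1.5)   # negative, by the odd symmetry in x
```

The two values agree to many digits, and the solution is odd in x with u = 0 at x = 0 for every t > 0.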


The integral is a cumulative probability for a normal distribution (this is the area under a
bell-shaped curve). Statisticians need these integrals erf(x) all the time. At $x = \infty$ we have
the total probability = total area under the curve = 1.

Finally, we can solve $u_t = u_{xx}$ from any starting function u(0, x). The key is to realize
that every function of x is an integral of shifted delta functions $\delta(x - a)$:

Every function $u_0(x)$ has   $\int_{-\infty}^{\infty} u_0(a)\, \delta(x - a)\, da = u_0(x)$.   (17)

By superposition, the solution to $u_t = u_{xx}$ must be an integral of shifted heat kernels.

Temperature at time t   $u(t, x) = \frac{1}{\sqrt{4\pi t}} \int_{-\infty}^{\infty} u_0(a)\, e^{-(x-a)^2/4t}\, da$.   (18)

I have used the crucial fact that when the point source shifts by a to become $\delta(x - a)$,
the solution also shifts by a. So I just shifted the heat kernel U, by changing x to x - a.
The heat equation on the whole line $-\infty < x < \infty$ is linear shift-invariant.

The solution (18) is reduced to one infinite integral, still not simple. And for a more
realistic finite bar, with boundary conditions at x = 0 and x = 1, we have to think again.
There will also be changes when the diffusion coefficient c in $u_t = (c\, u_x)_x$ is changing
with x or t or u. This thinking probably leads us to finite differences.
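Before turning to finite differences, the integral (18) itself can be approximated directly when the bar is infinite. The Python sketch below is an illustration only: solve_heat is a hypothetical name, and the midpoint rule, the cutoff at 12 standard deviations of the kernel, and the number of cells are all arbitrary choices. It is tested on $u_0 = \cos x$, where (11) with k = 1 and k = -1 gives the exact answer $e^{-t} \cos x$.

```python
import math

def solve_heat(u0, t, x, half_width=12.0, n=20000):
    """Midpoint rule for (18) over a in [x - R, x + R], with R = half_width * sqrt(2t)."""
    r = half_width * math.sqrt(2.0 * t)   # the kernel is negligible beyond this distance
    h = 2.0 * r / n
    total = 0.0
    for i in range(n):
        a = x - r + (i + 0.5) * h
        total += u0(a) * math.exp(-(x - a) ** 2 / (4.0 * t))
    return total * h / math.sqrt(4.0 * math.pi * t)

def step(a):
    """Starting temperature of Example 1: -1 on the left, +1 on the right."""
    return 1.0 if a > 0 else -1.0

from_cosine = solve_heat(math.cos, 0.5, 0.3)   # exact value is exp(-0.5) * cos(0.3)
from_step = solve_heat(step, 0.5, 0.3)         # exact value is erf(0.3 / sqrt(2))
```

A smooth start like cos x is reproduced to high accuracy. The step start is less accurate because the midpoint rule does not see exactly where the jump sits.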


Separation of Variables 


The basis functions $\sin k\pi x$ are eigenfunctions. The same is true for $\cos k\pi x$ and $e^{ik\pi x}$.
Let me show this by substituting u = B(t) A(x) into the equation $u_t = u_{xx}$. Right away
$u_t$ gives B' and $u_{xx}$ gives A''. The separated variables are connected by $u_t = u_{xx}$:

$B'(t)\, A(x) = B(t)\, A''(x)$   leads to   $\frac{A''(x)}{A(x)} = \frac{B'(t)}{B(t)} = \text{constant}$.   (19)

Why a constant? Because A''/A depends only on x and B'/B depends only on t. They
are equal, so neither one can move. Call that constant $-\lambda$:

$\frac{A''}{A} = -\lambda$ gives $A = \sin \sqrt{\lambda}\, x$ and $\cos \sqrt{\lambda}\, x$.   $\frac{B'}{B} = -\lambda$ gives $B = e^{-\lambda t}$.   (20)

The products $BA = e^{-\lambda t} \sin \sqrt{\lambda}\, x$ and $BA = e^{-\lambda t} \cos \sqrt{\lambda}\, x$ solve the heat equation
for any number $\lambda$. But the boundary condition u(t, 0) = 0 eliminates the cosines.
Then u = 0 at x = 1 requires $\sin \sqrt{\lambda} = 0$ and $\lambda = k^2\pi^2$. Separation of variables has
recovered the correct basis functions $\sin k\pi x$ as eigenfunctions for $A'' = -\lambda A$.


Example 7  (Smokestack problem) We backed away from the heat equation in 2 + 1
dimensions. The solution to $u_t = u_{xx} + u_{yy}$ involves three variables t, x, y. Put a
smokestack at the center point x = y = 0, and suppose there is no wind. Then nothing
depends on the direction angle $\theta$. Smoke will diffuse out from the center. The concentration
depends only on the radial distance r, and we solve the radially symmetric heat equation.
Our final solution is u(t, r).

The heat equation is not quite $u_t = u_{rr}$ because r = constant is curved (a circle).
The correct radial equation is perfect for separation of variables u = B(t) A(r).

$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r} \frac{\partial u}{\partial r}$   leads to   $B'(t)\, A(r) = B(t) \left( A'' + \frac{1}{r} A' \right)$.   (21)

Again B'/B = constant = $-\lambda$ and $B = e^{-\lambda t}$ as before. But instead of $A''/A = -\lambda$, we
have Bessel's equation for the radial eigenfunction A(r):

Basis functions A(r)   $\frac{d^2 A}{dr^2} + \frac{1}{r} \frac{dA}{dr} = -\lambda A$   has a variable coefficient $\frac{1}{r}$.   (22)

The solutions are among the special functions that have been studied for centuries. They
are not complex exponentials because the coefficient 1/r is not constant. Bessel replaces
Fourier. This book can't go all the way to solve Bessel's equation, but see Section 6.5.
A heat equation with symmetry led Bessel to new eigenfunctions.


REVIEW OF THE KEY IDEAS

1. The heat equation $u_t = u_{xx}$ is solved by $e^{-k^2\pi^2 t} \sin k\pi x$ for every k = 1, 2, ...

2. A combination of those solutions matches the initial u(0, x) to its Fourier sine series.

3. With $u_x = 0$ at x = 0 and 1, use cosines. With an infinite bar, use all $e^{-k^2 t} e^{ikx}$.

4. The heat kernel $U = e^{-x^2/4t}/\sqrt{4\pi t}$ solves $U_t = U_{xx}$ starting from $U_0 = \delta(x)$.

5. Separation into B(t) A(x) shows that A(x) is an eigenfunction of the "x part" $u_{xx}$.


Problem Set 8.3 


1  Solve the heat equation $u_t = c\, u_{xx}$ on an infinite bar with coefficient c, starting from
$u = e^{ikx}$ at t = 0. As in (10) the solution has the product form $u = e^{i\omega t} e^{ikx}$.
With c in the equation, find $\omega$ for each k.

2  Solve the same equation $u_t = c\, u_{xx}$ starting from the point source
$u = \delta(x) = \int e^{ikx}\, dk/2\pi$ at t = 0. By superposition, you integrate over all k the solutions u
in Problem 1. The result is the heat kernel as in equation (14) but adjusted for c.

3  To solve $u_t = c\, u_{xx}$ for a bar between x = 0 and x = 1, the basis functions are
still $\sin k\pi x$ (with u = 0 at the ends). What are the eigenvalues $\lambda_k$ that go into the
solution $\sum b_k(0)\, e^{\lambda_k t} \sin k\pi x$?

4  Following Problem 3, solve $u_t = c\, u_{xx}$ when the initial temperature is $u_0 = 1$ for
$\frac{1}{4} < x < \frac{3}{4}$ (and $u_0 = 0$ on the first and last quarters of the bar). The problem is to
find the coefficients $b_k(0)$ for that initial temperature.

5  Solve the heat equation $u_t = u_{xx}$ from a point source $u(x, 0) = \delta(x)$ with free
boundary conditions $u'(\pi, t) = u'(-\pi, t) = 0$. Use the infinite cosine series
$\delta(x) = (1 + 2\cos x + 2\cos 2x + \cdots)/2\pi$ multiplied by time decay factors $b_k(t)$.

6  (Bar from x = 0 to $x = \infty$) Solve $u_t = u_{xx}$ on the positive half of an infinite bar,
starting from the shifted delta function $u_0 = \delta(x - a)$ at a point x = a > 0. Here
is a way to use the full-bar heat kernel U in (14), and still keep u = 0 at x = 0.

Imagine a negative point source at x = -a. Solve the heat equation on the fully
infinite bar, including both sources in $u_0 = \delta(x - a) - \delta(x + a)$ at t = 0. Your
solution (a difference of heat kernels) will stay zero at the boundary x = 0 (Why?).
Then it must be the correct solution on the half-bar, since it started correctly.

7  Check that the basis functions $s_k = \sin (k + \frac{1}{2})\pi x$ are orthogonal over $0 \leq x \leq 1$.
Find a formula for the coefficient $B_4$ in the Fourier series $F(x) = \sum B_k s_k$.
(Multiply by $s_4(x)$ and integrate, to isolate $B_4$.)

8  The basis functions $\sin (k + \frac{1}{2})\pi x$ are for fixed-free boundaries (u = 0 at x = 0
and u' = 0 at x = 1). What are the basis functions for free-fixed boundaries
(u' = 0 at x = 0 and u = 0 at x = 1)?

9  Suppose $u_t = u_{xx} - u$ with boundary condition u = 0 at x = 0 and x = 1.
Find the new numbers $\lambda_k$ in the general solution $u = \sum b_k(0)\, e^{\lambda_k t} \sin k\pi x$.
(Previously $\lambda_k = -k^2\pi^2$, now there is a new term in $\lambda$ because of -u.)

10  Explain each step in equation (13). Solve $dU/dx = -xU/2t$ to reach $U = c\, e^{-x^2/4t}$.
How do the known infinite integrals $\int e^{-x^2} dx = \sqrt{\pi}$ and $\int U\, dx = 1$ lead to the
factor $1/\sqrt{4\pi t}$?

11  (Shift invariance) What is the solution to $u_t = u_{xx}$ starting from $\delta(x - a)$ at t = 0?

12  What are basis functions A(x, y) for heat flow in a square plate, when u = 0 along
the four sides x = 0, x = 1, y = 0, y = 1? The heat equation is $u_t = u_{xx} + u_{yy}$.
Find eigenfunctions for $A_{xx} + A_{yy} = \lambda A$ that satisfy the boundary conditions.

The first eigenfunction is $A_{11} = (\sin \pi x)(\sin \pi y)$. Find the eigenvalues $\lambda$.


13  Substitute $U = e^{-x^2/4t}/\sqrt{4\pi t}$ to show that this heat kernel solves $U_t = U_{xx}$.

Notes on a heat bath  (This is the opposite problem to a hot bar in a freezer.)
The bar is initially at U = 0. It is placed into a heat bath at the fixed temperature $U_B = 1$.
The boundary conditions are no longer zero and the bar will get hot.

The difference $V = U - U_B$ has zero boundary values, and its initial values are
V = -1. Now the eigenfunction method (separation of variables) solves for V. The
series in (7) is multiplied by -1 to account for V(x, 0) = -1. Adding back $U_B$ solves
the heat bath problem: $U = U_B + V = 1 - u(x, t)$.

Here $U_B = 1$ is the steady state solution at $t = \infty$, and V is the transient solution.
The transient starts at V = -1 and decays quickly to V = 0.

Heat bath at one end  This problem is different in another way too. The fixed
"Dirichlet" boundary condition is replaced by the free "Neumann" condition on the slope:
u'(1, t) = 0. Only the left end is in the heat bath. Heat flows down the metal bar and out
at the far end, now located at x = 1. How does the solution change for fixed-free?

Again $U_B = 1$ is a steady state. The boundary conditions apply to $V = U - U_B$:

Fixed-free eigenfunctions   V(0) = 0 and V'(1) = 0 lead to $A(x) = \sin (k + \frac{1}{2})\pi x$.

Those new eigenfunctions (adjusted to A'(1) = 0) give a new product form $B_k(t)\, A_k(x)$:

Fixed-free solution   $V(x, t) = \sum_{\text{odd } k} B_k(0)\, e^{-(k + \frac{1}{2})^2 \pi^2 t} \sin (k + \frac{1}{2})\pi x$.

All frequencies shift by $\frac{1}{2}$ and multiply by $\pi$, because $A'' = -\lambda A$ has a free end at
x = 1. The crucial question is: Does orthogonality still hold for these new eigenfunctions
$\sin (k + \frac{1}{2})\pi x$? The answer to Problem 7 is yes because $A'' = -\lambda A$ is symmetric.


Notes on stochastic equations and models for stock prices with Brownian motion.

A "stochastic differential equation" has a random term on the right hand side. Instead of a
smooth forcing term q(t), or even a delta function $\delta(t)$, the models for stock prices include
Brownian motion dW. The idea is subtle and important, and I will just write it down. A
random step has $dW = Z \sqrt{dt}$. Here Z has a normal Gaussian distribution with mean zero
and variance $\sigma^2 = 1$. But a new Z is chosen randomly at every instant.

The step size $\sqrt{\Delta t}$ produces a random walk W(t) with wild oscillations. You could see
a discrete random walk from $W(t + \Delta t) = W(t) + Z \sqrt{\Delta t}$, and then let $\Delta t$ approach zero.
The true random walk is continuous but nowhere differentiable.
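The discrete random walk is simple to simulate. The Python sketch below is an illustration only: random_walk is a hypothetical name, the seed, the 100 steps per path and the 2000 paths are arbitrary choices, and the sample mean and variance are statistical estimates that change with the seed.

```python
import math
import random

def random_walk(t_end, steps, rng):
    """Apply W(t + dt) = W(t) + Z sqrt(dt) with a fresh standard normal Z at every step."""
    dt = t_end / steps
    w = 0.0
    for _ in range(steps):
        w += rng.gauss(0.0, 1.0) * math.sqrt(dt)
    return w

rng = random.Random(0)   # fixed seed so the run is repeatable
samples = [random_walk(1.0, 100, rng) for _ in range(2000)]   # values of W(1)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Because each step has variance $\Delta t$ and the steps are independent, W(1) has mean 0 and variance 1 whatever the number of steps. The sample values should land near those numbers.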

A steady return S(t) on an investment has S' = aS. The growth is $S(t) = e^{at} S(0)$
exactly as in Chapter 1. But stock prices also respond to a stochastic part $\sigma\, dW$, where
the number $\sigma$ measures the volatility of the market. This mixes ups and downs from
Brownian motion $\sigma\, dW$ with steady growth (drift) from dS = aS dt:

"Diffusion" and "drift"   $\frac{dS}{S} = \sigma\, dW + a\, dt$.

Then the basic model for the value of a call option leads to the Black-Scholes equation.
The solution comes by a change of variables to reach the heat equation. When they are
buying and selling options, traders would have that solution available at all times.

8.4 The Wave Equation

Heat travels with infinite speed. Waves travel with finite speed. Start both of them from a
point source $u_0(x) = \delta(x)$. Compare the solutions at time t:

Heat equation   $u_t = u_{xx}$   $u(t, x) = \frac{1}{\sqrt{4\pi t}}\, e^{-x^2/4t}$ is a smooth function

Wave equation   $u_{tt} = c^2 u_{xx}$   $u(t, x) = \frac{1}{2}\delta(x - ct) + \frac{1}{2}\delta(x + ct)$ has spikes

We are starting from a big bang $u = \delta(x)$ at x = 0. At a later time t, the bang reaches
the two points x = ct and x = -ct. That represents travel to the right and to the left
with velocities dx/dt = c and -c. The speed of sound in air is c = 342 meters/second.

Notice another difference from the heat equation. After the bang passes point x = c
at time t = 1, silence returns: $\delta(x - ct) = 0$ when ct > x. For the heat equation,
temperatures like $e^{-x^2/4t}$ never return to zero. A wavefront passes by and we hear it only
once. There is no echo or our ears would be full of sound.

In reality the heat equation is often mixed in with the wave equation. The sound diffuses
as it travels. Then we do hear noise forever, but not much: the intensity decays fast.
The One-Way Wave Equation 


We begin with a problem that will be particularly clear. It is first order in time ($t > 0$)
and first order in space ($-\infty < x < \infty$). The velocity is still $c$:

One-way wave   $$\frac{\partial u}{\partial t} = c\,\frac{\partial u}{\partial x} \quad\text{with}\quad u = u_0(x) \text{ at } t = 0. \tag{1}$$


One solution is $u = e^{x+ct}$. Its time derivative $\partial u/\partial t$ brings a factor $c$. The same will be
true for $\sin(x+ct)$ and $\cos(x+ct)$ and any function of $x+ct$. The right function is $u_0(x+ct)$
because this gives the correct start $u_0(x)$ at time $t = 0$:

Solution to $u_t = cu_x$   $$u(t,x) = u_0(x + ct). \tag{2}$$


Suppose $u_0(x)$ is a step function (a wall of water). We have $u_0(x) = 0$ for negative $x$ and
$u_0(x) = 1$ for positive $x$. Then the dam breaks. A wall of water moves to the left with
velocity $c$. At time $t$, the water reaches the point $x = -ct$ where $x + ct = 0$.

Wall at $x = -ct$   $$u = u_0(x+ct) = 0 \ \text{ for } x + ct < 0 \qquad u = u_0(x+ct) = 1 \ \text{ for } x + ct > 0 \tag{3}$$

The line $x + ct = 0$ is called a “characteristic.” The signal travels (with signal speed $c$) along
that line in space-time, to tell about the jump from $u = 0$ to $u = 1$.

For any initial function $u_0(x)$, the solution $u = u_0(x + ct)$ is a shift of the graph.
It is a one-way wave, no change in shape. The waves from $u_{tt} = c^2u_{xx}$ go both ways.
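The shift solution (2) is easy to test numerically. The sketch below is an illustration only: the helper names `one_way_wave`, `step`, and `pde_residual` are hypothetical, and the derivatives are approximated by centered differences rather than computed exactly.

```python
import math

def one_way_wave(u0, c):
    """Return u(t, x) = u0(x + c*t), the solution of u_t = c u_x."""
    def u(t, x):
        return u0(x + c * t)
    return u

def step(x):
    """Wall of water: 0 for negative x, 1 for positive x."""
    return 1.0 if x > 0 else 0.0

def pde_residual(u, c, t, x, h=1e-5):
    """Centered-difference estimate of u_t - c*u_x (near zero for a solution)."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2 * h)
    return u_t - c * u_x

if __name__ == "__main__":
    smooth = one_way_wave(math.sin, 2.0)
    print(pde_residual(smooth, 2.0, 0.3, 0.7))   # tiny, limited by the difference step
    wall = one_way_wave(step, 2.0)
    print(wall(1.0, -2.5), wall(1.0, -1.5))      # either side of the wall at x = -ct = -2
```

At time $t = 1$ with $c = 2$ the wall sits at $x = -2$, so the two sample points fall on opposite sides of it.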
Waves in Space


Now we solve the wave equation $\partial^2u/\partial t^2 = c^2\,\partial^2u/\partial x^2$. The three-dimensional form
would be $u_{tt} = c^2(u_{xx} + u_{yy} + u_{zz})$. This is the equation satisfied by light as it travels
in empty space: a vacuum. The speed of light $c$ is about 300 million meters per second
(186,000 miles/second). This is the fastest possible speed in Einstein’s relativity theory.

The atmosphere slows down light. Positioning by GPS uses the speed c and the travel 
time to find the distance from satellite to receiver. (It includes many other extremely small 
effects.) In fact GPS is the only everyday technology I know that requires both special 
relativity and general relativity. Amazing that your cell phone can include GPS. 


The wave equation is second order in time because of $\partial^2u/\partial t^2$. We are given the
initial velocity $v_0(x)$ as well as the initial position $u_0(x)$.

At $t = 0$ and all $x$   $$u = u_0(x) \quad\text{and}\quad \partial u/\partial t = v_0(x). \tag{4}$$


Look for functions that have $u_{tt}$ equal to $c^2u_{xx}$. Now $e^{x+ct}$ and $e^{x-ct}$ will both
succeed. Two time derivatives produce a factor $c$ twice (or a factor $-c$ twice, both cases
give $c^2$). All functions $f(x + ct)$ and all functions $g(x - ct)$ satisfy the wave equation.
The wave equation is linear, so we can combine those solutions.

Complete solution to $u_{tt} = c^2u_{xx}$   $$u(t,x) = f(x+ct) + g(x-ct) \tag{5}$$


Two functions $f(x + ct)$ and $g(x - ct)$ are exactly what we need to match two conditions
$u_0$ and $v_0$ at $t = 0$:

Position   $u_0(x) = f(x) + g(x)$
Velocity   $v_0(x) = cf'(x) - cg'(x)$   and then   $\dfrac{1}{c}\displaystyle\int_0^x v_0\,dx = f(x) - g(x)$.

Add those equations to find $2f(x)$. Subtract those equations to find $2g(x)$. Divide by 2:

$$f(x) = \frac12u_0(x) + \frac{1}{2c}\int_0^x v_0\,dx \qquad g(x) = \frac12u_0(x) - \frac{1}{2c}\int_0^x v_0\,dx \tag{6}$$


Then d’Alembert’s solution $u$ to the wave equation has a wave traveling to the left with
shape $f$ and a wave traveling to the right with shape $g$:

$$u(t,x) = f(x+ct) + g(x-ct) = \frac{u_0(x+ct) + u_0(x-ct)}{2} + \frac{1}{2c}\int_{x-ct}^{x+ct} v_0\,dx \tag{7}$$
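As a hedged illustration of d’Alembert’s formula (7), the sketch below evaluates the average of the two shifted copies of $u_0$ plus the integral of $v_0$ over $[x - ct,\ x + ct]$. The integral is approximated with the trapezoid rule, which is an assumption of this sketch, and the function name `dalembert` is hypothetical.

```python
import math

def dalembert(u0, v0, c, n=2000):
    """Return u(t, x) from formula (7); the v0 integral uses n trapezoids."""
    def u(t, x):
        a, b = x - c * t, x + c * t
        h = (b - a) / n
        total = 0.5 * (v0(a) + v0(b))
        for k in range(1, n):
            total += v0(a + k * h)
        integral = total * h                      # approximates the integral of v0 from a to b
        return 0.5 * (u0(b) + u0(a)) + integral / (2 * c)
    return u

if __name__ == "__main__":
    c = 1.5
    rest = dalembert(math.sin, lambda x: 0.0, c)
    print(rest(0.4, 0.3), math.sin(0.3) * math.cos(c * 0.4))      # standing wave sin(x) cos(ct)
    kick = dalembert(lambda x: 0.0, math.cos, c)
    print(kick(0.4, 0.3), math.cos(0.3) * math.sin(c * 0.4) / c)  # from initial velocity cos(x)
```

Starting from rest with $u_0 = \sin x$ reproduces $(\sin x)(\cos ct)$, and starting flat with $v_0 = \cos x$ gives $(\cos x)(\sin ct)/c$.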
Example 1   Start from rest (velocity $v_0 = 0$) with a sine wave $u_0(x) = \sin\omega x$. That
wave splits into two waves:

$$u(t,x) = \frac{u_0(x+ct) + u_0(x-ct)}{2} = \frac12\sin(\omega x + c\omega t) + \frac12\sin(\omega x - c\omega t). \tag{8}$$

The trigonometry formula $\sin A + \sin B = 2\sin\frac{A+B}{2}\cos\frac{A-B}{2}$ produces a short answer:

$u(t,x) = (\sin\omega x)(\cos c\omega t)$   Two traveling waves produce one standing wave.


You sometimes see standing waves in the ocean. Not what a surfer wants to find. 


Figure 8.7: Always two traveling waves. Sometimes their sum is a standing wave.


The Wave Equation from x = 0 to x = 1


Now we leave infinite space-time. The waves we know best are on a finite Earth. 
They may come from a violin string, fixed at both ends. They could also be water waves 
(even a tsunami). They may be electromagnetic waves: light or X-rays or TV signals. 
Or they may be sound waves that our ears convert into words. All these waves are bringing 
information to our brains, and they are essential to life as we know it. 

Start with a violin string of length 1. The velocity c depends on the tension in the string. 
The ends at x = 0 and 1 are assumed to remain fixed: 


Boundary conditions at the ends   $$u(t,0) = 0 \quad\text{and}\quad u(t,1) = 0. \tag{9}$$

If we pluck the string with our finger at time $t = 0$, we give a vertical displacement $u_0$ and
a vertical velocity $v_0$ (this might be zero):

Initial conditions at the start   $$u(0,x) = u_0(x) \quad\text{and}\quad \frac{\partial u}{\partial t}(0,x) = v_0(x). \tag{10}$$


If we remove our finger after time zero, waves move along the string. They are reflected 
back at the ends of the string. The sound is not a single beautiful note (it is a mixture 
of waves with many frequencies). Still a composer can include this plucking sound in a 
symphony and a guitarist uses it all the time. 
The usual sound from violins comes from a continuous source, which is the bow.
Now we are solving $u_{tt} = u_{xx} + f(t,x)$. When the violinist puts a finger on the string,
that changes the length and it changes the frequencies. Instead of waves of length 1 we
will have waves of length $L$ and higher notes.

With several strings the violinist or cellist or guitarist is producing several waves of 
different frequencies to form chords. Let me stay with one string of length 1. 


Separation of Variables 


We will use the most important method of solving partial differential equations by hand. 
The wave equation $u_{tt} = c^2u_{xx}$ has two variables $t$ and $x$. The simplest solutions are
functions of $x$ multiplied by functions of $t$.

If $u = X(x)T(t)$ then $u_{tt} = c^2u_{xx}$ is   $$X(x)T''(t) = c^2X''(x)T(t). \tag{11}$$

$T''$ and $X''$ are ordinary second derivatives. We can divide equation (11) by $c^2XT$:

Separation of variables   $$\frac{T''}{c^2T} = \frac{X''}{X}. \tag{12}$$

The function $T''/T$ depends only on $t$. The function $X''/X$ depends only on $x$. So both
functions are constant and they are equal. By writing $-\omega^2$ for the constant, the two separated
equations have the right form:

$$X'' = -\omega^2X \qquad X = A\cos\omega x + B\sin\omega x \tag{13}$$

$$T'' = -\omega^2c^2T \qquad T = C\cos\omega ct + D\sin\omega ct \tag{14}$$
Key question: Which frequencies $\omega$ are allowed? The boundary values at $x = 0$ and $x = 1$
decide this perfectly. We want sines and not cosines, in order to have $X(0) = 0$.
We want frequencies that are multiples of $\pi$ in order to have $X(1) = B\sin\omega = 0$.
This gives very specific frequencies $\omega = \pi, 2\pi, 3\pi, \ldots$ and no others.


The base frequency of the violin string is $\pi$ and the harmonics are multiples $\omega = n\pi$.
If we touch the string and reduce its length to $L$, we want $\sin\omega L = 0$. Then the permitted
frequencies increase to $\omega = n\pi/L$. The notes go up the scale, separated by an octave.

Those frequencies $\omega$ also go into the time function $T(t)$. The initial condition is $T'(0) = 0$
if the initial velocity is $v_0 = 0$. Only the cosine survives in the time direction:

$$X = B\sin n\pi x \qquad T = C\cos n\pi ct \qquad u = XT = b_n(\sin n\pi x)(\cos n\pi ct). \tag{15}$$


With length $L$, the natural frequencies in time are $\omega = n\pi c/L$. The wavelengths in space
are $2L/n$. The displacement of the string is a combination of solutions $X(x)T(t)$:

$$u(t,x) = \sum_{n=1}^{\infty} b_n\left(\sin\frac{n\pi x}{L}\right)\left(\cos\frac{n\pi ct}{L}\right) \tag{16}$$

You see immediately that $u_{tt} = c^2u_{xx}$ for every one of those terms, and any combination.
Final question: What are the numbers $b_n$? Those are decided by the remaining condition:

Initial condition   $$u(0,x) = u_0(x) = \sum_{n=1}^{\infty} b_n\sin\frac{n\pi x}{L} \tag{17}$$

This is a Fourier sine series! The formula for $b_k$ comes from multiplying both sides by
$\sin k\pi x/L$ and integrating from 0 to $L$ along the string. Only one term $n = k$ survives:

$$\int_0^L u_0(x)\sin\frac{k\pi x}{L}\,dx = \int_0^L b_k\left(\sin\frac{k\pi x}{L}\right)^2dx = \frac{L}{2}\,b_k. \tag{18}$$

Inserting each $b_k$ into (16) completes the solution of the wave equation on $0 < x < L$.


Example 2   Suppose the length is $L = 3$ and the initial displacement is a hat function:

$$u_0(x) = x \ \text{ for } 0 \le x \le 1 \quad\text{and}\quad u_0(x) = \tfrac12(3 - x) \ \text{ for } 1 \le x \le 3.$$

The integrals in (18) lead in Mathematica to $b_k = 9\sin(k\pi/3)/k^2\pi^2$. The decay rate is $1/k^2$ for
this function $u_0(x)$ with a corner. The slope drops from 1 to $-\frac12$ at $x = 1$. The infinite
series (16) will converge at every point in space-time to the correct solution $u(t,x)$.
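Example 2 can be checked numerically. The sketch below is an illustration under assumptions: the integral in (18) is approximated by the midpoint rule, the function names are hypothetical, and the closed form $9\sin(k\pi/3)/k^2\pi^2$ used for comparison comes from integrating by parts by hand rather than from a quoted source.

```python
import math

L = 3.0   # string length in Example 2

def hat(x):
    """Initial displacement: slope 1 up to x = 1, then slope -1/2 down to x = 3."""
    return x if x <= 1.0 else 0.5 * (3.0 - x)

def sine_coefficient(u0, k, length, n=6000):
    """b_k = (2/L) * integral of u0(x) sin(k pi x / L), by the midpoint rule."""
    h = length / n
    total = 0.0
    for j in range(n):
        x = (j + 0.5) * h
        total += u0(x) * math.sin(k * math.pi * x / length)
    return 2.0 / length * total * h

def string_solution(coeffs, length, c):
    """Partial sum of series (16) with the given coefficients b_1, b_2, ..."""
    def u(t, x):
        return sum(b * math.sin(n * math.pi * x / length)
                     * math.cos(n * math.pi * c * t / length)
                   for n, b in enumerate(coeffs, start=1))
    return u

if __name__ == "__main__":
    b = [sine_coefficient(hat, k, L) for k in range(1, 61)]
    print(b[0], 9 * math.sin(math.pi / 3) / math.pi ** 2)   # b_1 two ways
    u = string_solution(b, L, c=1.0)
    print(u(0.0, 1.0), hat(1.0))                            # series vs. corner value
```

Every third coefficient vanishes because $\sin(k\pi/3) = 0$ when $k$ is a multiple of 3. Convergence is slowest at the corner $x = 1$, which is where the $1/k^2$ decay shows.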

Notice also that every piece of $u$ splits into $f + g$, by the formula for $\sin A\cos B$:

$$\sin\frac{n\pi x}{L}\cos\frac{n\pi ct}{L} = \frac12\sin\frac{n\pi(x+ct)}{L} + \frac12\sin\frac{n\pi(x-ct)}{L} = f(x+ct) + g(x-ct).$$

We get two wave functions as always, specially chosen to fit the string length $L$. If the
initial velocity $v_0$ is not zero, then the solution $u(t,x)$ also contains sine functions of $t$.

Our functions $X(x) = \sin n\pi x/L$ are actually eigenfunctions of the string:
$Ax = \lambda x$ becomes $X'' = -\omega^2X$   The matrix $A$ changes to a second derivative.

Again linear algebra and differential equations go hand in hand. For linear equations.


REVIEW OF THE KEY IDEAS

1. The one-way wave equation $u_t = cu_x$ is solved by $u(t,x) = u_0(x + ct)$.
2. The two-way equation $u_{tt} = c^2u_{xx}$ allows two waves $f(x + ct)$ and $g(x - ct)$.
3. At $t = 0$, the d’Alembert solution (7) matches $u_0(x)$ and $v_0(x)$ on the whole line.
4. The Fourier solution (16) chooses $b_k$ so that $u(0,x) = u_0(x)$ for $0 < x < L$.
5. Separation of variables into $u = X(x)T(t)$ gives $X'' = -\omega^2X$ and $T'' = -\omega^2c^2T$.
6. Zero boundary conditions give $\omega = n\pi/L$ and eigenfunctions $X(x) = \sin n\pi x/L$.
Problem Set 8.4

Problems 1-4 are about the one-way wave equation $\partial u/\partial t = c\,\partial u/\partial x$.

1   Suppose $u(0,x) = \sin 2x$. What is the solution to $u_t = cu_x$? At which times
    $t_1, t_2, \ldots$ will the solution return to the initial condition $\sin 2x$?

2   Suppose $u_0(x) = \delta(x)$, a big bang at the origin of the one-dimensional universe.
    At time $t$ the bang is heard at the point $x =$ ____. For $u_{tt} = c^2u_{xx}$ the bang
    will reach the two points $x =$ ____ and $x =$ ____ at time $t$.

3   (a) Integrate both sides of $u_t = cu_x$ from $x = -\infty$ to $\infty$ to prove that the total
        mass $M = \int u\,dx$ is constant: $dM/dt = 0$.
    (b) Multiply by $u$ and integrate both sides of $uu_t = cuu_x$ to prove that $E = \int u^2\,dx$
        is constant.

4   Is the wave $u(t,x) = u_0(x + ct)$ traveling left or right if $c > 0$? To solve $u_t = cu_x$
    on the halfline $0 < x < \infty$, why is a boundary condition $u(t,0) = 0$ not wanted?
    With $c < 0$ and waves in the opposite direction, that condition is appropriate.


Problems 5-9 are about the one-dimensional wave equation $\partial^2u/\partial t^2 = c^2\,\partial^2u/\partial x^2$.


5   A “box of water” has $u_0(x) = 1$ for $-1 \le x \le 1$. Starting with zero velocity $v_0(x)$,
    the wave equation $u_{tt} = c^2u_{xx}$ is solved by $u(t,x) = \frac12u_0(x + ct) + \frac12u_0(x - ct)$.
    Graph this solution for small $t = 1/2c$ and large $t = 3/c$.


6   Under a flat ocean with $u_0(x) = 1$, an earthquake produces $v_0(x) = \delta(x)$. A one-dimensional
    tsunami starts moving with speed $c$. What is the solution (7) at time $t$?

7   Separation of variables gives $u(t,x) = (\sin nx)(\sin nct)$ and three other similar
    solutions to $u_{tt} = c^2u_{xx}$. What are those three? Which complex functions $e^{ikx}e^{i\omega t}$
    solve the wave equation?

8   The 3D wave equation $u_{tt} = u_{xx} + u_{yy} + u_{zz}$ becomes 1D when $u$ has spherical
    symmetry: $u$ depends only on $r$ and $t$.

    $$r = \sqrt{x^2 + y^2 + z^2} \quad\text{and}\quad \frac{\partial^2u}{\partial t^2} = \frac{\partial^2u}{\partial r^2} + \frac{2}{r}\frac{\partial u}{\partial r}$$

    (a) Multiply by $r$ to find $(ru)_{tt} = (ru)_{rr}$. Then $ru$ is a function of $r + t$ and $r - t$.
    (b) Describe the solution $ru = \delta(r - t - 1)$. This spherical sound wave has the
        radius $r =$ ____ at $t = 8$.

9   The wave equation along a bar with density $\rho$ and stiffness $k$ is $(\rho u_t)_t = (ku_x)_x$.
    What is the velocity $c$ in $u_{tt} = c^2u_{xx}$? What is $\omega$ in $u = \sin(\pi x/L)\cos\omega t$?
10  The small vibrations of a beam satisfy the fourth order equation $u_{tt} = -c^2u_{xxxx}$.
    Look for solutions $u = X(x)T(t)$ and find separate equations for the functions $X$
    and $T$. Then find four solutions $X(x)$ when $T(t) = \cos\omega t$.

11  If that beam is clamped ($u = 0$ and $\partial u/\partial x = 0$ at both ends $x = 0$ and $x = L$),
    show that the frequencies $\omega$ in Problem 10 must have $(\cos\omega L)(\cosh\omega L) = 1$.

Problems 12-16 solve the wave equation with boundary conditions at $x = 0$ and $x = L$.

12  A string plucked halfway along has $u_0(x) = \delta(x - \frac{L}{2})$ and $v_0(x) = 0$. Find the
    Fourier coefficients $b_k$ from equation (18). Write the first three terms of the Fourier
    series solution in (16).

13  Suppose the string starts with zero velocity $v_0(x)$ from a hat function: $u_0(x) = 2x/L$
    for $x < L/2$ and $u_0(x) = 2(L - x)/L$ for $x > L/2$. Find the Fourier coefficients $b_k$
    from (18) and the first two nonzero terms of $u(t,x)$ in (16).

14  Suppose the string starts with zero velocity $v_0(x)$ from a box function: $u_0(x) = 1$
    for $x < L/2$. Find all the $b_n$ in the solution $u = \sum b_n\sin(n\pi x/L)\cos(n\pi ct/L)$.

15  The boundary condition at a free end $x = L$ is $\partial u/\partial x = 0$ instead of $u = 0$.
    Solve $X'' + \omega^2X = 0$ to find $X(x)$ and all allowable $\omega$’s with this new condition.
    Then solve $T'' + \omega^2c^2T = 0$ to complete the solution $u = \sum a_nX(x)T(t)$.

16  What is the solution $u(t,x)$ on a string of length $L = 2$ if $u(0,x) = \delta(x - 1)$?
    The end $x = 0$ is fixed by $u(t,0) = 0$ and the end $x = 2$ is free: $\partial u/\partial x\,(t,2) = 0$.
8.5 The Laplace Transform


When it succeeds, the Laplace transform can turn a linear differential equation into an
algebra problem. Laplace transforms are applied to initial value problems ($t > 0$).
Fourier transforms are for boundary value problems. Laplace has $e^{-st}$ instead of $e^{ikx}$.


When does this transform method succeed? I see two desirable situations:
1. The linear equation should have constant coefficients, as in $Ay'' + By' + Cy = f(t)$.
2. The driving function $f(t)$ should have a “convenient” transform.


Our list of good functions includes $f(t) = e^{at}$ and its transform $F(s) = 1/(s - a)$.
Then the differential equation will tell us the transform $Y(s)$ of the solution. The final step
is to discover which function $y(t)$ has this transform $Y(s)$. Using our list of transforms
and especially the rules for finding new transforms, this becomes a problem in algebra:
Invert the transform $Y(s)$ to find the solution $y(t)$. These pages complete Section 2.6.

Particular solutions are easy with $f(t) = e^{at}$. The method of undetermined coefficients
taught us to look for $y_p(t) = Ye^{at}$. The Laplace transform is not strictly needed when
$f(t) = e^{at}$ or $t^n$ or $\sin\omega t$ or $\cos\omega t$. But for driving functions that turn on and off,
and functions that jump or explode (step functions and delta functions and worse),
the algebra becomes more systematic and better organized by the Laplace transform.

Examples 1, 2, 3 with real, imaginary, and complex poles show you the key ideas. 


The Transform F(s)


Start with a function $f(t)$ defined for $t \ge 0$. Multiply by $e^{-st}$ and integrate from $t = 0$ to
$t = \infty$. The result is the Laplace transform $F(s)$ and it depends on the exponent $s$:

Laplace transform   $$\mathcal{L}[f(t)] = F(s) = \int_{t=0}^{\infty} f(t)\,e^{-st}\,dt. \tag{1}$$

The number $s$ can be real or complex. The one key requirement on $s$ is that the infinite
integral in (1) must give a finite answer. Here are examples needing $s > 0$ and $s > a$.

$$f(t) = 1 \qquad F(s) = \int_0^\infty e^{-st}\,dt = \left[\frac{e^{-st}}{-s}\right]_0^\infty = \frac{1}{s} \tag{2}$$

$$f(t) = e^{at} \qquad F(s) = \int_0^\infty e^{(a-s)t}\,dt = \left[\frac{e^{(a-s)t}}{a-s}\right]_0^\infty = \frac{1}{s-a} \tag{3}$$

The integral of $e^{-st}$ is finite when $s$ is positive. More than that, it is finite when the real part
of $s$ is positive. A factor $e^{-i\omega t}$ from the imaginary part $i\omega$ has absolute value 1. Laplace
transforms are defined when the real part of $s$ exceeds some value $s_0$. Here $s_0 = a$.




Important   All functions in this section have $f(t) = 0$ for $t < 0$. They start at $t = 0$.

So the constant function $f(t) = 1$ is actually the unit step function. It jumps from 0 to 1
at $t = 0$. Its derivative is the delta function $\delta(t)$; this includes the spike at $t = 0$. In
this way, the initial value problem $y' + y = 1$ ignores all $t < 0$ and starts from $y(0)$.

You will see that the Laplace transform of that equation is $sY(s) - y(0) + Y(s) = 1/s$.
Then algebra gives $Y(s)$ and the inverse Laplace transform gives $y(t)$.

The second example $f = e^{at}$ includes the first example $f = 1$, which has $a = 0$.
Then $1/(s - a)$ becomes $1/s$. We need Re $s > a$ to drive $e^{at}e^{-st}$ to zero at $t = \infty$.
There are decreasing functions like $f(t) = e^{-t^2}$ that allow every complex number $s$.
There are also rapidly increasing functions like $f(t) = e^{t^2}$ that allow no $s$ at all.


For a delta function located at $t = T \ge 0$, the integral picks out the transform $e^{-sT}$:

$$f(t) = \delta(t - T) \qquad F(s) = \int_0^\infty \delta(t - T)\,e^{-st}\,dt = e^{-sT}. \tag{4}$$
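To complete this group of examples (the all-star functions), a simple trick gives the
transforms of $\cos\omega t$ and $\sin\omega t$. Write Euler’s formula $e^{i\omega t} = \cos\omega t + i\sin\omega t$. Take the
Laplace transform of every term:

Linearity   $\mathcal{L}[e^{i\omega t}] = \mathcal{L}[\cos\omega t] + i\,\mathcal{L}[\sin\omega t]$

The left side is $1/(s - i\omega)$. Multiply by $(s + i\omega)/(s + i\omega)$ to see real and imaginary parts:

$$\frac{1}{s - i\omega}\,\frac{s + i\omega}{s + i\omega} = \frac{s + i\omega}{s^2 + \omega^2} \qquad \mathcal{L}[\cos\omega t] = \frac{s}{s^2 + \omega^2} \quad\text{and}\quad \mathcal{L}[\sin\omega t] = \frac{\omega}{s^2 + \omega^2} \tag{5}$$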
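The transforms found so far can be spot-checked by evaluating the integral (1) numerically. The sketch below is an illustration under assumptions: it takes $s$ real, truncates the infinite integral at a hypothetical cutoff `t_max`, and uses Simpson's rule. The function name `laplace_numeric` is not from the text.

```python
import math

def laplace_numeric(f, s, t_max=40.0, n=40000):
    """Approximate F(s) = integral of f(t) e^{-st} from 0 to t_max (Simpson, n even)."""
    h = t_max / n
    total = f(0.0) + f(t_max) * math.exp(-s * t_max)
    for k in range(1, n):
        t = k * h
        weight = 4.0 if k % 2 else 2.0          # Simpson weights 4, 2, 4, 2, ...
        total += weight * f(t) * math.exp(-s * t)
    return total * h / 3.0

if __name__ == "__main__":
    w = 3.0
    print(laplace_numeric(lambda t: 1.0, 2.0), 1 / 2.0)                      # 1/s
    print(laplace_numeric(lambda t: math.exp(t), 3.0), 1 / (3.0 - 1.0))      # 1/(s - a)
    print(laplace_numeric(lambda t: math.cos(w * t), 2.0), 2.0 / (4 + w * w))  # s/(s^2 + w^2)
    print(laplace_numeric(lambda t: math.sin(w * t), 2.0), w / (4 + w * w))    # w/(s^2 + w^2)
```

The truncation is harmless only when $s$ exceeds the growth rate of $f$, which is the same requirement Re $s > s_0$ that makes the transform exist.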


Exponents in f(t) are Poles in F(s)


Let me pause one minute, before using Laplace transforms to solve differential equations.
We can already see the key connection between a function $f(t)$ and its transform $F(s)$.
Look at this Table of Transforms:


f(t) 


F(s) 


Here is the important message. If $f(t)$ includes $e^{at}$ then $F(s)$ has a “pole” at $s = a$.
A pole is an isolated point $a$, real or complex, where the function $F(s)$ blows up. Some
integer power $(s - a)^m$ will cancel the pole and leave an “analytic” function $(s - a)^mF(s)$.
An example shows this matchup of exponents in $f(t)$ to poles in the transform $F(s)$:

$f(t) = e^{0t} + e^{at} + e^{i\omega t} + e^{-i\omega t} + te^{ct}$ has exponents $0, a, i\omega, -i\omega, c$

$$F(s) = \frac{1}{s} + \frac{1}{s-a} + \frac{2s}{(s - i\omega)(s + i\omega)} + \frac{1}{(s-c)^2} = \frac{\text{something}}{s(s-a)(s-i\omega)(s+i\omega)(s-c)^2}$$

The first term $1/s$ has exponent 0 in $f(t)$ and blowup at the pole $s = 0$. The last term
$1/(s - c)^2$ has exponent $c$ and double blowup (double pole) at $s = c$. In the middle,
$2\cos\omega t$ contains two exponents $i\omega$ and $-i\omega$, so the transform $F(s)$ has those two poles.

At the very end you see all the pieces of $F(s)$ tangled together in one big fraction.
This is how $F(s)$ comes to us from a differential equation. Normally we must factor the
denominator to see five separate poles at $s = 0, a, i\omega, -i\omega, c$. Then $F(s)$ splits into its
simple pieces (called partial fractions). The inverse Laplace transform of each piece of
$F(s)$ gives a piece of $f(t)$. PF2 and PF3 in Section 2.6 allowed two or three pieces.


An engineer moves poles by changing the design. Then the exponents move. The system
becomes more stable if their real parts become more negative. A quick accurate picture of
stability comes from the poles of $F(s)$. If all those poles are in the left half of the complex
plane, where Re $s < 0$, the function will decay to zero (asymptotic stability).


The new function in this example is $te^{ct}$. We remember that the extra factor $t$ appears
in the solution $y(t)$ when the exponent $c$ is repeated ($c$ is a double root of the polynomial
$s^2 - 2cs + c^2$ that comes from $y'' - 2cy' + c^2y$). The double root becomes a double pole
in the transform, when $(s - c)^2$ shows up in the denominator of $F(s)$. Here is the required
step, to confirm that the transform of $f(t) = te^{ct}$ is $F(s) = 1/(s - c)^2$.

The derivative of $F(s) = \displaystyle\int f(t)e^{-st}\,dt$ is $\dfrac{dF}{ds} = \displaystyle\int -tf(t)e^{-st}\,dt$.

Rule: If the function $f(t)$ transforms to $F(s)$, then $tf(t)$ transforms to $-dF/ds$.
When this rule is applied to $f(t) = e^{ct}$ with $F(s) = 1/(s-c)$, we learn that $te^{ct}$ transforms
to $-dF/ds = 1/(s - c)^2$.

This rule extends directly to higher powers of $t$ in $t^nf(t)$. Each time you multiply by $t$,
take the derivative of $F(s)$. Remember to multiply by $-1$:

$$t^2f(t) \longrightarrow (-1)^2\frac{d^2F}{ds^2} = \frac{d^2}{ds^2}\left(\frac{1}{s-c}\right) = \frac{d}{ds}\,\frac{-1}{(s-c)^2} = \frac{2}{(s-c)^3}$$

Continuing this way, the transform of $t^ne^{ct}$ is $n!/(s - c)^{n+1}$. This was the last entry in our
Table of Transforms. In the special case $c = 0$, the transform of $t^n$ is $n!/s^{n+1}$.

Now we can work with any real poles $c$ or imaginary poles $i\omega$ in $F(s)$. Example 3
will allow complex poles $c + i\omega$. This solves all equations $Ay'' + By' + Cy = 0$.
Transforms of Derivatives


Differential equations involve $dy/dt$. We must connect the transform $\mathcal{L}[dy/dt]$ to $\mathcal{L}[y]$.
This step was especially easy for Fourier transforms: just multiply by $ik$. For Laplace
transforms we expect to multiply $Y(s)$ by $s$ to get $\mathcal{L}[dy/dt]$, but another term appears.

The reason this happens is that Laplace completely ignores $t < 0$. The integral starts
at $t = 0$ and the number $y(0)$ is important. A good thing that $y(0)$ enters the Laplace
transform, because we certainly expect it to enter the solution to a differential equation.

It is integration by parts that connects $\mathcal{L}[dy/dt]$ to $\mathcal{L}[y]$. Two minus signs cancel:

$$\mathcal{L}\left[\frac{dy}{dt}\right] = \int_0^\infty \frac{dy}{dt}\,e^{-st}\,dt = \int_0^\infty y(t)\,(se^{-st})\,dt + \Big[y(t)e^{-st}\Big]_0^\infty = s\,\mathcal{L}[y] - y(0). \tag{6}$$

This is the key fact that turns a differential equation for $y(t)$ into an algebra problem for
$Y(s)$. If we repeat this step (apply it now to $dy/dt$), you will see the transform of the
second derivative. Use equations (6) and (7) to transform differential equations.

$$\mathcal{L}\left[\frac{d^2y}{dt^2}\right] = s\,\mathcal{L}\left[\frac{dy}{dt}\right] - y'(0) = s^2\mathcal{L}[y] - sy(0) - y'(0). \tag{7}$$

Let me use this rule right away to solve three differential equations. The first has real poles.
The second has imaginary poles. The third has complex poles $s = -1 \pm i$.


Example 1   Solve $y' - y = 2e^{-t}$ starting from $y(0) = 1$.
Solution   Take the Laplace transform of both sides. We know $\mathcal{L}[2e^{-t}] = 2/(s + 1)$:

$s\,\mathcal{L}[y] - y(0) - \mathcal{L}[y] = \mathcal{L}[2e^{-t}]$ is the same as $(s - 1)Y(s) = 1 + \dfrac{2}{s+1}$.

Then algebra gives $Y(s)$ and we split into “partial fractions” to recognize $y(t)$.

$$Y(s) = \frac{1}{s-1} + \frac{2}{(s-1)(s+1)} = \frac{1}{s-1} + \frac{1}{s-1} - \frac{1}{s+1} = \frac{2}{s-1} - \frac{1}{s+1}$$

The inverse transform of $Y(s)$ is $y(t) = 2e^t - e^{-t}$

I always check that $y(0) = 2 - 1 = 1$ and $y'(t) = 2e^t + e^{-t}$ agrees with $y + 2e^{-t}$.
And don’t forget our usual method. A particular solution is $y_p = -e^{-t}$. It has the same form
as the driving function $f(t) = e^{-t}$. The null solution is $y_n = Ce^t$.

From Chapter 2   $y = y_p + y_n = -e^{-t} + Ce^t$   $y(0) = 1$ gives $C = 2$
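A third check on Example 1 is to integrate $y' = y + 2e^{-t}$ numerically and compare with $2e^t - e^{-t}$. The sketch below assumes a classical fourth-order Runge-Kutta step, a standard method that this section does not use, and the function names are hypothetical.

```python
import math

def rk4(f, y0, t_end, n):
    """Integrate y' = f(t, y) from t = 0 to t_end with n classical RK4 steps."""
    h = t_end / n
    t, y = 0.0, y0
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

def example1_rhs(t, y):
    """Right side of y' = y + 2 e^{-t}, the equation in Example 1."""
    return y + 2.0 * math.exp(-t)

def example1_exact(t):
    """The solution found by the Laplace transform: 2 e^t - e^{-t}."""
    return 2.0 * math.exp(t) - math.exp(-t)

if __name__ == "__main__":
    print(rk4(example1_rhs, 1.0, 1.0, 1000), example1_exact(1.0))
```

The two printed numbers agree to many digits, which supports both the partial fractions and the inverse transforms.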


Maybe the earlier method is simpler for this example? The next examples give practice
with second order equations. The complex poles of $Y(s)$ give oscillations $e^{i\omega t}$ in $y(t)$.
Example 2   Solve the equation $y'' + y = \frac12\sin 2t$ starting from rest: $y(0) = y'(0) = 0$.
The transform of $y''$ is $s^2Y(s)$ from (7):

$$s^2Y(s) + Y(s) = \frac{1}{s^2+4} \quad\text{and then}\quad Y(s) = \frac{1}{(s^2+1)(s^2+4)}$$

Partial fractions will rewrite that transform $Y(s)$ as

$$Y(s) = \frac{1}{(s^2+1)(s^2+4)} = \frac13\,\frac{(s^2+4) - (s^2+1)}{(s^2+1)(s^2+4)} = \frac{1/3}{s^2+1} - \frac{1/3}{s^2+4} \tag{8}$$

We recognize those fractions as transforms of sine functions with $\omega = 1$ and $\omega = 2$:

Solution   $y(t) = \frac13\sin t - \frac16\sin 2t$ has initial values $y(0) = 0$ and $y'(0) = 0$.

The transform of $\sin 2t$ is $2/(s^2 + 4)$, which explains why 1/3 becomes 1/6.
In Chapter 2 we would have found $y_p(t)$ and $y_n(t)$ to reach the same $y(t)$:
$$y = y_p + y_n = -\tfrac16\sin 2t + c_1\cos t + c_2\sin t.$$
Then $c_1 = 0$ because $y(0) = 0$, and $c_2 = \frac13$ because $y'(0) = 0$. Both ways are good.
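Both steps of Example 2 can be spot-checked: the partial fraction identity at a few sample values of $s$, and the differential equation for the final $y(t)$. This is a hedged sketch with hypothetical function names, and the second derivative is estimated by a centered difference rather than computed exactly.

```python
import math

def Y(s):
    """Transform from Example 2 before splitting."""
    return 1.0 / ((s * s + 1.0) * (s * s + 4.0))

def Y_split(s):
    """The same transform after partial fractions."""
    return (1.0 / 3.0) / (s * s + 1.0) - (1.0 / 3.0) / (s * s + 4.0)

def y(t):
    """Proposed solution (1/3) sin t - (1/6) sin 2t."""
    return math.sin(t) / 3.0 - math.sin(2.0 * t) / 6.0

def ode_residual(t, h=1e-4):
    """Estimate y'' + y - (1/2) sin 2t with a centered second difference."""
    y_tt = (y(t + h) - 2.0 * y(t) + y(t - h)) / (h * h)
    return y_tt + y(t) - 0.5 * math.sin(2.0 * t)

if __name__ == "__main__":
    for s in (0.5, 2.0, 3.0 + 1.0j):
        print(abs(Y(s) - Y_split(s)))        # essentially zero
    print(ode_residual(0.8), y(0.0))         # residual near zero, y(0) = 0
```

Complex values of $s$ work too, since the identity between the two fractions is pure algebra.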
Example 3   $y'' + 2y' + 2y = 0$ with $y(0) = y'(0) = 1$ has $Y(s) = \dfrac{s+3}{s^2 + 2s + 2}$.

Then the roots of $s^2 + 2s + 2$ are the complex poles $s = -1 \pm i$.

This $Y(s)$ is not yet in our table. But we know the complex solutions $e^{(-1+i)t}$ and
$e^{(-1-i)t}$. Their real and imaginary parts are $e^{-t}\cos t$ and $e^{-t}\sin t$. The combination that
has $y(0) = y'(0) = 1$ is $y = e^{-t}\cos t + 2e^{-t}\sin t$. This must be the function $y(t)$ that
transforms to $Y(s)$.


The real and imaginary parts of $e^{ct}e^{i\omega t}$ transform to the real and imaginary
parts of $1/(s - c - i\omega)$. Those two new transforms solve Example 3 when $c = -1$
and $\omega = 1$. We can now solve every equation $Ay'' + By' + Cy = 0$.

$e^{ct}\cos\omega t$ transforms to $\dfrac{s - c}{(s-c)^2 + \omega^2}$     $e^{ct}\sin\omega t$ transforms to $\dfrac{\omega}{(s-c)^2 + \omega^2}$
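The path from poles to solution in Example 3 can be sketched in code. This is an illustration under assumptions: the function names `poles` and `damped_solution` are hypothetical, and the sketch covers only the complex-pole case $c \pm i\omega$ with $\omega \ne 0$.

```python
import cmath
import math

def poles(A, B, C):
    """Roots of A s^2 + B s + C: the poles of Y(s) for A y'' + B y' + C y = 0."""
    disc = cmath.sqrt(B * B - 4 * A * C)
    return (-B + disc) / (2 * A), (-B - disc) / (2 * A)

def damped_solution(y0, v0, c, omega):
    """y = e^{ct}(p cos wt + q sin wt) with y(0) = y0 and y'(0) = v0."""
    p = y0
    q = (v0 - c * y0) / omega              # from y'(0) = c*p + omega*q
    def y(t):
        return math.exp(c * t) * (p * math.cos(omega * t) + q * math.sin(omega * t))
    return y, p, q

if __name__ == "__main__":
    s1, s2 = poles(1, 2, 2)
    print(s1, s2)                          # the pair -1 + i and -1 - i
    y, p, q = damped_solution(1.0, 1.0, s1.real, s1.imag)
    print(p, q)                            # coefficients of e^{-t} cos t and e^{-t} sin t
```

For Example 3 the coefficients come out as 1 and 2, matching $y = e^{-t}\cos t + 2e^{-t}\sin t$.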


Shifts and Step Functions and Cutoffs 


Suppose the driving function f(t) in a differential equation turns on at time T. Or suppose 
it turns off. Or it jumps to a different function. All these jumps in f(t) are realistic in 
practical problems, and they are automatically handled by the Laplace transform. 
Essentially, we need the transform of a step function. The basic example is a unit step
that jumps from $f = 0$ for $t < T$ to $f = 1$ for $t \ge T$. The transform is an easy integral:

$$f(t) = H(t - T) \qquad F(s) = \int_T^\infty e^{-st}\,dt = \left[\frac{e^{-st}}{-s}\right]_T^\infty = \frac{e^{-sT}}{s}. \tag{9}$$
A step function at $T$ transforming to $e^{-sT}/s$ is an example of a new rule.
The step at $T$ is a time shift of the step at $t = 0$. Multiply the transform by $e^{-sT}$.

The original $f(t)$ has the transform $F(s)$. The shifted function is zero until $t = T$, and
then it is $f(t - T)$. For the example of a unit step, the shifted step is zero for $t < T$.

Here is the proof of the transform rule for the shifted function: multiply by $e^{-sT}$.

$f(t)$ shifts to $f(t - T)$ and $F(s)$ becomes $e^{-sT}F(s)$:
$$\int_T^\infty f(t - T)\,e^{-st}\,dt = \int_0^\infty f(\tau)\,e^{-s(\tau+T)}\,d\tau = e^{-sT}F(s).$$

The first integral has $T \le t < \infty$. The second integral has $0 \le \tau < \infty$. The new variable
$\tau = t - T$ shifts the lower limit on the integral back to $\tau = 0$, and it produces the
all-important factor $e^{-sT}$. We end with two examples that need this shift rule.


Example 4 (Unit step function)   Solve $y' - ay = H(t - T) = \begin{cases} 0 & t < T \\ 1 & t \ge T \end{cases}$
The transform of every term (with $y(0) = 1$) will give the transform $Y(s)$ of the solution:

$$sY(s) - 1 - aY(s) = \frac{e^{-sT}}{s} \qquad Y(s) = \frac{1}{s-a} + \frac{e^{-sT}}{(s-a)s}. \tag{10}$$

The inverse transform of $1/(s - a)$ is $e^{at}$. Split the other fraction into two parts:

$$\frac{1}{(s-a)s} = \frac{1}{a}\left(\frac{1}{s-a} - \frac{1}{s}\right) \ \text{ has inverse transform } \ \frac{1}{a}\left(e^{at} - 1\right). \tag{11}$$

The factor $e^{-sT}$ in (10) will shift that function in (11). The final solution is

Jump in $y'$, corner in $y$   $$y(t) = \begin{cases} e^{at} & \text{for } t \le T \\ e^{at} + \dfrac{1}{a}\left(e^{a(t-T)} - 1\right) & \text{for } t \ge T \end{cases} \tag{12}$$

The first part $y = e^{at}$ has $y' = ay$ as required. This meets the second part correctly at
$t = T$ (no jump in $y$). Then the second part of $y(t)$ continues with $y' = ay + 1$:

Check   $$\frac{d}{dt}\left[\frac{1}{a}\left(e^{a(t-T)} - 1\right)\right] = e^{a(t-T)} = a\left[\frac{1}{a}\left(e^{a(t-T)} - 1\right)\right] + 1.$$

Question   Could we have solved this problem without Laplace transforms? Certainly
$y = e^{at}$ solves the first part starting from $y(0) = 1$. This is $y_n$ since $f = 0$, and it reaches
$e^{aT}$ at time $T$. Starting from there, we need to add on a particular solution $y_p$. This $y_p$
will match the driving function $f = 1$ that begins to act at $t = T$:

$y_p' - ay_p = 1$ starting from $y_p(T) = 0$.

Eventually, and somehow, we would find the particular solution $y_p = (e^{a(t-T)} - 1)/a$.
Combined with $y_n = e^{at}$, the complete solution $y_n + y_p$ agrees with equation (12).
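The two-piece solution of Example 4 can be tested directly. The sketch below is only an illustration: the function names and the sample values $a = 0.5$, $T = 1$ are hypothetical, and $y'$ is estimated by a centered difference on each side of the switch time.

```python
import math

def y_step(t, a, T):
    """Solution (12) of y' - a*y = H(t - T) with y(0) = 1."""
    if t < T:
        return math.exp(a * t)
    return math.exp(a * t) + (math.exp(a * (t - T)) - 1.0) / a

def residual(t, a, T, h=1e-5):
    """Estimate y' - a*y: expect 0 before the step turns on and 1 after."""
    y_t = (y_step(t + h, a, T) - y_step(t - h, a, T)) / (2 * h)
    return y_t - a * y_step(t, a, T)

if __name__ == "__main__":
    a, T = 0.5, 1.0
    print(residual(0.5, a, T))                             # before the step: about 0
    print(residual(1.7, a, T))                             # after the step: about 1
    print(y_step(T - 1e-12, a, T), y_step(T, a, T))        # no jump in y at t = T
```

The solution is continuous at $t = T$ while its slope jumps by 1, which is the corner in $y$ described above.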
Example 5   Suppose the driving function $f(t) = 1$ turns off instead of on at time $T$:

Solve $y' - ay = \begin{cases} 1 & t \le T \\ 0 & t > T \end{cases}$ with $y(0) = 1$.

Solution   Instead of the previous $H(t - T)$, this new driving function is $1 - H(t - T)$.
The step function drops from 1 to 0. We still take the Laplace transform of every term
in the differential equation:

$$sY(s) - 1 - aY(s) = \text{transform of } [1 - H(t - T)] = \frac{1}{s} - \frac{e^{-sT}}{s}$$

Solve this equation for $Y(s)$ and begin to recognize the inverse transform:

$$Y(s) = \frac{1}{s-a} + \frac{1}{(s-a)s} - \frac{e^{-sT}}{(s-a)s} \ \text{ has the new term } \ \frac{1}{(s-a)s} \ \text{ compared to (10).}$$

The inverse transform of this new term is $(e^{at} - 1)/a$, according to (11). Since the last term
in $Y(s)$ now has a minus sign, the final solution has two pieces meeting at $t = T$:

$$y(t) = \begin{cases} e^{at} + \dfrac{1}{a}\left(e^{at} - 1\right) & \text{for } t \le T \\ e^{at} + \dfrac{1}{a}\left(e^{at} - 1\right) - \dfrac{1}{a}\left(e^{a(t-T)} - 1\right) & \text{for } t \ge T. \end{cases}$$

That first part for $t \le T$ would be our standard $y_n + y_p$, starting from $y(0) = 1$. The
second part matches the first part at $t = T$ (no jump in $y$). That second part simplifies to

$$y(t) = e^{at} + \frac{e^{at} - e^{a(t-T)}}{a} \quad\text{and we verify that } y' = ay.$$


Rules for the Laplace Transform 


Part of this section is about specific functions $f(t)$. We made a Table of Transforms $F(s)$.
The other part of the section is about rules. (This is like calculus. You learn the derivatives
of $t^n$ and $\sin t$ and $\cos t$ and $e^t$. Then you learn the product rule and quotient rule and
chain rule.) We need a Table of Rules for the Laplace transform, when we know that
$F(s)$ and $G(s)$ are the transforms of $f(t)$ and $g(t)$.


Addition Rule       The transform of $f(t) + g(t)$ is $F(s) + G(s)$
Shifting Rule       The transform of $f(t - T)$ is $e^{-sT}F(s)$
Derivative of f     The transform of $df/dt$ is $sF(s) - f(0)$
Derivative of F     The transform of $tf(t)$ is $-dF/ds$
Convolution Rule    Section 8.6 will transform $f(t)g(t)$ and invert $F(s)G(s)$
Problem Set 8.5

1   When the driving function is $f(t) = \delta(t)$, the solution starting from rest is the
    impulse response. The impulse is $\delta(t)$, the response is $y(t)$. Transform this equation
    to find the transfer function $Y(s)$. Invert to find the impulse response $y(t)$.

    $y'' + y = \delta(t)$ with $y(0) = 0$ and $y'(0) = 0$

2   (Important) Find the first derivative and second derivative of $f(t) = \sin t$ for $t \ge 0$.
    Watch for a jump at $t = 0$ which produces a spike (delta function) in the derivative.

3   Find the Laplace transform of the unit box function $b(t) = \{1 \text{ for } 0 \le t < 1\} = H(t) - H(t - 1)$.
    The unit step function is $H(t)$ in honor of Oliver Heaviside.

4   If the Fourier transform of $f(t)$ is defined by $\hat f(k) = \int f(t)e^{-ikt}\,dt$ and $f(t) = 0$
    for $t < 0$, what is the connection between $\hat f(k)$ and the Laplace transform $F(s)$?

5   What is the Laplace transform $R(s)$ of the standard ramp function $r(t) = t$?
    For $t < 0$ all functions are zero. The derivative of $r(t)$ is the unit step $H(t)$.
    Then multiplying $R(s)$ by $s$ gives ____.

6   Find the Laplace transform $F(s)$ of each $f(t)$, and the poles of $F(s)$:
    (a) $f = 1 + t$        (b) $f = t\cos\omega t$        (c) $f = \cos(\omega t - \theta)$
    (d) $f = \cos^2t$      (e) $f = e^{-2t}\cos t$        (f) $f = te^{-t}\sin\omega t$

7   Find the Laplace transforms of $f(t) =$ next integer above $t$ and $f(t) = t\,\delta(t)$.

8   Inverse Laplace Transform: Find the function $f(t)$ from its transform $F(s)$:
    (a) $\dfrac{1}{s - 2\pi i}$      (b) $\dfrac{s+1}{s^2+1}$      (c) $\dfrac{1}{(s-1)(s-2)}$
    (d) $1/(s^2 + 2s + 10)$      (e) $e^{-s}/(s - a)$      (f) $2s$

9   Solve $y'' + y = 0$ from $y(0)$ and $y'(0)$ by expressing $Y(s)$ as a combination of
    $s/(s^2 + 1)$ and $1/(s^2 + 1)$. Find the inverse transform $y(t)$ from the table.

10  Solve $y'' + 3y' + 2y = \delta$ starting from $y(0) = 0$ and $y'(0) = 1$ by Laplace transform.
    Find the poles and partial fractions for $Y(s)$ and invert to find $y(t)$.

11  Solve these initial-value problems by Laplace transform:
    (a) $y' + y = e^{i\omega t}$, $y(0) = 8$         (b) $y'' - y = e^t$, $y(0) = 0$, $y'(0) = 0$
    (c) $y' + y = e^{-t}$, $y(0) = 2$          (d) $y'' + y = 6t$, $y(0) = 0$, $y'(0) = 0$
    (e) $y' - i\omega y = \delta(t)$, $y(0) = 0$     (f) $my'' + cy' + ky = 0$, $y(0) = 1$, $y'(0) = 0$

12  The transform of $e^{At}$ is $(sI - A)^{-1}$. Compute that matrix (the transfer function)
    when $A = [1\ 1;\ 1\ 1]$. Compare the poles of the transform to the eigenvalues of $A$.
13  If $dy/dt$ decays exponentially, show that $sY(s) \to y(0)$ as $s \to \infty$.

14  Transform Bessel’s time-varying equation $ty'' + y' + ty = 0$ using $\mathcal{L}[ty] = -dY/ds$
    to find a first-order equation for $Y$. By separating variables or by substituting
    $Y(s) = C/\sqrt{1 + s^2}$, find the Laplace transform of the Bessel function $y = J_0$.

15  Find the Laplace transform of a single arch of $f(t) = \sin\pi t$.

16  Your acceleration $v' = c(v^* - v)$ depends on the velocity $v^*$ of the car ahead:
    (a) Find the ratio of Laplace transforms $V^*(s)/V(s)$.
    (b) If that car has $v^* = t$ find your velocity $v(t)$ starting from $v(0) = 0$.

17  A line of cars has $v_n' = c[v_{n-1}(t - T) - v_n(t - T)]$ with $v_0(t) = \cos\omega t$ in front.
    (a) Find the growth factor $A = 1/(1 + i\omega e^{i\omega T}/c)$ in oscillation $v_n = A^ne^{i\omega t}$.
    (b) Show that $|A| < 1$ and the amplitudes are safely decreasing if $cT < \frac12$.
    (c) If $cT > \frac12$ show that $|A| > 1$ (dangerous) for small $\omega$. (Use $\sin\theta < \theta$.)

    Human reaction time is $T \ge 1$ sec and human aggressiveness is $c = 0.4$/sec.
    Danger is pretty close. Probably drivers adjust to be barely safe.

18  For $f(t) = \delta(t)$, the transform $F(s) = 1$ is the limit of transforms of tall thin box
    functions $b_\epsilon(t)$. The boxes have width $\epsilon \to 0$ and height $1/\epsilon$ and area 1.

    Inside integrals, $b_\epsilon(t) = \begin{cases} 1/\epsilon & \text{for } 0 < t < \epsilon \\ 0 & \text{otherwise} \end{cases}$ approaches $\delta(t)$.

    Find the transform $B_\epsilon(s)$, depending on $\epsilon$. Compute the limit of $B_\epsilon(s)$ as $\epsilon \to 0$.

19  The transform $1/s$ of the unit step function $H(t)$ comes from the limit of the transforms
    of short steep ramp functions $r_\epsilon(t)$. These ramps have slope $1/\epsilon$:

    $r_\epsilon(t) = t/\epsilon$ for $0 \le t \le \epsilon$ and $r_\epsilon(t) = 1$ for $t \ge \epsilon$.
    Compute $R_\epsilon(s) = \displaystyle\int_0^\epsilon \frac{t}{\epsilon}\,e^{-st}\,dt + \int_\epsilon^\infty e^{-st}\,dt$. Let $\epsilon \to 0$.

20  In Problems 18 and 19, show that the derivative of the ramp function $r_\epsilon(t)$ is the
    box function $b_\epsilon(t)$. The “generalized derivative” of a step is the ____ function.

21  What is the Laplace transform of $y'''(t)$ when you are given $Y(s)$ and
    $y(0), y'(0), y''(0)$?

22  The Pontryagin maximum principle says that the optimal control is “bang-bang”:
    it only takes on the extreme values permitted by the constraints. To go from rest at $x = 0$
    to rest at $x = 1$ in minimum time, use maximum acceleration $A$ and
    deceleration $-B$. At what time $t$ do you change from the accelerator to the brake?
    (This is the fastest driving between two red lights.)
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8.6 Convolution (Fourier and Laplace) 


This section is about multiplication. Convolution is a different way to multiply functions. 
It is also a way to multiply vectors. The rule for vectors may look new, but actually you 
learned it in third grade. Let me start with ordinary multiplication of numbers, and build up 
to convolution of vectors and convolution of functions. 


When 112 is multiplied by 213, watch how we collect nine small multiplications:


                1   1   2                         a    b    c
                2   1   3                         2    1    3
        ---------------------              -------------------------
                3   3   6                             3a   3b   3c
            1   1   2                            1a   1b   1c
        2   2   4                           2a   2b   2c
        ---------------------
        2   3   8   5   6


We don’t think about this pattern—it is so familiar. In our minds we are just multiplying 
112 by 213 in small steps. The new idea is to think of (1,1, 2) as a vector and (2, 1,3) as 
another vector. The convolution of those vectors is the vector (2, 3, 8, 5,6). 

I need a new symbol * for the convolution of two vectors c and d: 


Convolution of vectors    c * d = (c_0, c_1, ...) * (d_0, d_1, ...) = (c_0 d_0, c_0 d_1 + c_1 d_0, ...)


That line ends with an important hint about c * d, if we can see it. First, every c_i
multiplies every d_j. (Those are the nine small multiplications.) Then the nine products are
collected in a special way. We put c_0 d_1 with c_1 d_0. The next component of c * d will be
c_0 d_2 + c_1 d_1 + c_2 d_0.

In the third grade multiplication, we are collecting together all the products c_i d_j that
go in the 100s column. Those were 300 + 100 + 400. To express this with algebra, the
nth component of c * d will be c_0 d_n + c_1 d_{n−1} + ··· + c_n d_0. These are all the products
c_i d_j with i + j = n.


$$\text{Convolution} \quad c * d = d * c \qquad (c * d)_n = \sum_{i+j=n} c_i d_j = \sum_{i} c_i d_{n-i}. \qquad (1)$$


The summation symbol allows the vectors to be infinitely long. The key point
is that small multiplications c_i d_j go together when i + j = n, which is the same as
j = n − i. Let me show that rule again, this time for 2 + x + 3x² times 1 + x + 2x².
We are collecting all the pieces that multiply each power x^n.


                     1 +  x + 2x²
                     2 +  x + 3x²              When we multiply polynomials,
        ---------------------------            we take the convolution of
                    3x² + 3x³ + 6x⁴            the vectors of coefficients.
               x +   x² + 2x³
          2 + 2x +  4x²
        ---------------------------
          2 + 3x +  8x² + 5x³ + 6x⁴            (2,1,3) * (1,1,2) = (2,3,8,5,6)
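The collecting rule i + j = n can be written as two nested loops. The Python sketch below is only an illustration and not part of the original text; the helper names `convolve` and `evaluate_at_ten` are made up for this example. It reproduces (1,1,2) * (2,1,3) and then puts the carrying back in to recover 112 × 213.

```python
def convolve(c, d):
    """Return c * d: entry n collects every product c[i] * d[j] with i + j = n."""
    out = [0] * (len(c) + len(d) - 1)
    for i, ci in enumerate(c):
        for j, dj in enumerate(d):
            out[i + j] += ci * dj
    return out


def evaluate_at_ten(digits_high_first):
    """Treat the entries as base-10 'digits' (possibly above 9) and carry."""
    value = 0
    for digit in digits_high_first:
        value = 10 * value + digit
    return value


product = convolve([1, 1, 2], [2, 1, 3])
print(product)                    # [2, 3, 8, 5, 6]
print(evaluate_at_ten(product))   # 23856, which is 112 * 213
```

The same `convolve` multiplies the polynomial coefficient vectors (2,1,3) and (1,1,2), since the order of the two inputs does not matter.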
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We will connect convolution of coefficients to multiplication of Fourier series. First, 
allow me to show one more example that collects the small multiplications c_i d_j in the same
“convolution way.” That example is a matrix-vector multiplication Cd. The matrix C has
the numbers c_0, c_1, ... along its diagonals and C times d is exactly the convolution c * d.


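Constant diagonals, Toeplitz matrix, shift invariant:

$$Cd = \begin{bmatrix} c_0 & 0 & 0 \\ c_1 & c_0 & 0 \\ c_2 & c_1 & c_0 \\ 0 & c_2 & c_1 \\ 0 & 0 & c_2 \end{bmatrix} \begin{bmatrix} d_0 \\ d_1 \\ d_2 \end{bmatrix} = \begin{bmatrix} c_0 d_0 \\ c_1 d_0 + c_0 d_1 \\ c_2 d_0 + c_1 d_1 + c_0 d_2 \\ c_2 d_1 + c_1 d_2 \\ c_2 d_2 \end{bmatrix} = c * d \qquad (2)$$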


These “convolution matrices” are the key to signal processing. In that highly active world,
the matrix C is a filter. The way to understand this filter is through its frequency response
c_0 + c_1 e^{−iθ} + c_2 e^{−2iθ}.
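A short check of equation (2) may help. This sketch is an illustration and not from the text: `toeplitz_matrix` and `matvec` are hypothetical helper names. It builds the 5 by 3 constant-diagonal matrix from c = (1,1,2) and confirms that C times d gives the same vector as the convolution c * d.

```python
def toeplitz_matrix(c, ncols):
    """Constant-diagonal matrix whose (i, j) entry is c[i - j], or 0 off the band."""
    nrows = len(c) + ncols - 1
    return [[c[i - j] if 0 <= i - j < len(c) else 0 for j in range(ncols)]
            for i in range(nrows)]


def matvec(M, v):
    """Ordinary matrix-vector product with lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


c, d = [1, 1, 2], [2, 1, 3]
C = toeplitz_matrix(c, len(d))
for row in C:
    print(row)
print(matvec(C, d))   # [2, 3, 8, 5, 6], the convolution c * d
```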

We are ready to connect convolution with Fourier series and Laplace transforms. 


Multiplying f (x)g(x) is Convolution of Coefficients 


Convolution answers a question that we unavoidably ask. When Σ c_k e^{ikx} multiplies
Σ d_l e^{ilx} (call those functions f(x) and g(x)), what are the Fourier coefficients of the
function h(x) = f(x)g(x)? The answer is certainly not c_k d_k. We have to multiply every
coefficient c_k times every coefficient d_l. All those small multiplications c_k d_l produce the
coefficients of (Σ c_k e^{ikx})(Σ d_l e^{ilx}). The logic of the convolution rule has two steps:

1. c_k e^{ikx} times d_l e^{ilx} equals c_k d_l e^{inx} when k + l = n.

2. The e^{inx} term in f(x)g(x) contains every product c_k d_l in which l = n − k.

The nth Fourier coefficient of (Σ c_k e^{ikx})(Σ d_l e^{ilx}) is the nth component of c * d:

$$\text{Multiply functions } f, g;\ \text{convolve coefficients } c, d \qquad \text{Coefficient of } e^{inx} \text{ in } f(x)g(x) = \sum_{k=-\infty}^{\infty} c_k d_{n-k}. \qquad (3)$$


Example 1 The “identity vector” in convolution is δ = (..., 0, 0, 1, 0, 0, ...). Then
δ * d = d for every vector d. The “identity function” is i(x) = 1. Then i(x)g(x) = g(x)
for every function g. The Fourier coefficients of i(x) = 1 are exactly δ.


You see how convolution in frequency space (k-space) leads to multiplication
in function space (x-space). This is the central idea of the convolution rule.


Example 2. The autocorrelation of a vector c is the convolution c * c’. That vector
c’ is the reverse of c. The components of c’ are the Fourier coefficients c̄_{−k} of the
complex conjugate of f(x). So autocorrelation c * c’ gives the Fourier coefficients of the
product of f(x) with its conjugate, which is |f(x)|²:

$$f \bar{f} = (1 + e^{ix})(1 + e^{-ix}) = 1e^{-ix} + 2 + 1e^{ix} \qquad c * c' = (0,1,1) * (1,1,0) = (1,2,1).$$


The autocorrelation of the box vector (0,1,1) is the hat vector (1,2,1). Box * box = hat. 
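Rule (3) and Example 2 can be tested at a sample point. The sketch below is an illustration, not the author's code; the names `convolve` and `series` and the sample point x = 0.7 are arbitrary choices. It compares f(x)g(x) with the series built from c * d, and then compares |1 + e^{ix}|² with the hat coefficients (1,2,1) placed at n = −1, 0, 1.

```python
import cmath


def convolve(c, d):
    """Entry n collects every product c[i] * d[j] with i + j = n."""
    out = [0] * (len(c) + len(d) - 1)
    for i, ci in enumerate(c):
        for j, dj in enumerate(d):
            out[i + j] += ci * dj
    return out


def series(coeffs, x, first_index=0):
    """Evaluate the sum of coeffs[m] * e^{i (first_index + m) x}."""
    return sum(cm * cmath.exp(1j * (first_index + m) * x)
               for m, cm in enumerate(coeffs))


x = 0.7                                   # any sample point would do
c, d = [1, 1, 2], [2, 1, 3]               # coefficients for k = 0, 1, 2
product_of_functions = series(c, x) * series(d, x)
series_of_convolution = series(convolve(c, d), x)

box = [1, 1]                              # f(x) = 1 + e^{ix}
hat = convolve(box, box[::-1])            # coefficients for n = -1, 0, 1
abs_f_squared = abs(series(box, x)) ** 2
hat_series = series(hat, x, first_index=-1)

print(abs(product_of_functions - series_of_convolution))  # roundoff only
print(hat, abs(hat_series - abs_f_squared))
```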
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Convolution of Functions 


The reverse question is equally important and has to be answered. If f(x) and g(x) have
Fourier coefficients c_k and d_k, what function has the Fourier coefficients c_k d_k? We are
multiplying vectors in k-space. Then we have convolution f * g of functions in x-space!

$$\text{Periodic Convolution} \qquad (f * g)(x) = \int_0^{2\pi} f(y)\,g(x - y)\,dy = \int_0^{2\pi} g(y)\,f(x - y)\,dy. \qquad (4)$$

Vector convolution is (c * d)_n = Σ c_i d_{n−i}. The key is i + (n − i) = n. Convolution
of functions has an integral instead of a sum (of course). Above all we notice that
y + (x − y) = x. The pattern stays exactly the same when the functions are not periodic
and the integrals go from −∞ to ∞:

$$\text{Infinite Convolution} \qquad (f * g)(x) = \int_{-\infty}^{\infty} f(y)\,g(x - y)\,dy = \int_{-\infty}^{\infty} g(y)\,f(x - y)\,dy. \qquad (5)$$

For the Laplace transform, all functions are zero for t < 0. Change x and y to t and T.

$$\text{One-sided (Laplace)} \qquad (f * g)(t) = \int_0^{t} f(T)\,g(t - T)\,dT \quad \text{because } f(T) = 0 \text{ for } T < 0 \text{ and } g(t - T) = 0 \text{ for } T > t.$$


Solving Differential Equations by Convolution 


I want to apply convolution to the main problem of this book: the solution of equations
like y’ − ay = f(t) and −y” + y = f(x). Those are easy problems and we know the
answers. Simplicity is good because it keeps the main point clear. Convolution will offer us a
new way to write the solutions y(t) from Laplace and y(x) from Fourier.

I will recall the old ways to solve the same equations. The next page has a summary 
of the outstanding examples in this book—linear equations with constant coefficients. 


Example 3 Solve the equation y’ — ay = f(t) by convolution, starting from y(0) = 0. 


Solution Take the Laplace transform of both sides, and divide to find Y(s): 


$$sY(s) - aY(s) = F(s) \quad \text{gives} \quad Y(s) = \frac{F(s)}{s - a} = G(s)F(s). \qquad (6)$$


The transform F(s) of the driving function is multiplied by the “transfer function” G(s).
In this problem G(s) = 1/(s − a). Then y(t) is the inverse transform of Y(s) = G(s)F(s).


The key is convolution. Multiplication in s-space becomes convolution in t-space.
This rule gives the solution y = g * f from Y = GF. Then we prove the rule.


482 Chapter 8. Fourier and Laplace Transforms 


The inverse transform of the transfer function G(s) is the impulse response g(t).
For the equation y’ − ay = f(t), the transfer function is G(s) = 1/(s − a) and its inverse
transform is g(t) = e^{at}. Then the multiplication Y(s) = G(s)F(s) becomes a convolution
of the impulse response e^{at} with the driving function f(t):

$$\text{Solution} \qquad y(t) = e^{at} * f(t) = \int_{T=0}^{t} e^{a(t - T)} f(T)\,dT. \qquad (7)$$


Please recognize this solution. We are integrating e^{−at} f(t) for the fourth time! The central
problem of Chapter 1 was y’ − ay = f(t) (or q(t)). There we proposed three methods.


1. The integrating factor e^{−at} multiplies y’ − ay = f(t). Integrate (e^{−at} y)’ = e^{−at} f.


2. Variation of parameters in the null solution y_n = Ce^{at} gives y_p(t) = C(t) e^{at}.


3. Every input f(T) is multiplied by its growth factor e^{a(t−T)}. Combine the outputs.


4. (New) The solution y(t) is the convolution of f(t) with the impulse response e^{at}.


The impulse response is g(t) = g * δ, when the input is the impulse f(t) = δ(t).
The forced response is y = g * f, when the force is f(t). Always the convolution of
the driving force f(t) with the Green’s function g(t) produces the output y(t).
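Formula (7) is easy to try numerically. The sketch below is a rough illustration, not the book's method of computation: the trapezoidal rule, the helper name `convolve_from_zero`, and the choices a = −2 and f(t) = 1 are all assumptions made for this example. It compares the convolution integral with the known solution y = (e^{at} − 1)/a of y’ − ay = 1, y(0) = 0.

```python
import math


def convolve_from_zero(g, f, t, steps=2000):
    """Trapezoidal estimate of (g * f)(t) = integral_0^t g(t - T) f(T) dT."""
    h = t / steps
    total = 0.5 * (g(t) * f(0.0) + g(0.0) * f(t))
    for k in range(1, steps):
        T = k * h
        total += g(t - T) * f(T)
    return h * total


a = -2.0


def impulse_response(t):
    return math.exp(a * t)        # g(t) = e^{at}


def constant_input(T):
    return 1.0                    # f(t) = 1


t = 1.5
y_convolution = convolve_from_zero(impulse_response, constant_input, t)
y_exact = (math.exp(a * t) - 1.0) / a     # solves y' - ay = 1 with y(0) = 0
print(y_convolution, y_exact)
```

Any other input f(T) can be passed in the same way; only the exact comparison value changes.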


Confession I used Green’s name partly because the letter g appeared so conveniently. 
My deeper reason is to express a central idea that connects differential equations and 
matrix equations—the two themes of this book. Convolution with the impulse response 
(the Green’s function) is just like multiplication by the inverse matrix A~!. 

Here is the message that comes from AA^{−1} = I. The vector g_j in column j of A^{−1} is
the response to the delta vector δ_j = (·, 0, 1, 0, ·) in column j of the identity matrix.


A g_j = δ_j in linear algebra        g’ − ag = δ(t) in differential equations


I hope you find this helpful. The Green’s function g(t − T) gives the response at time t
to a unit impulse at time T. The total response at t is the integral of impulses f(T) times
responses g(t − T). Compare with the solution v = A^{−1}b to a matrix equation Av = b.

The inverse matrix A^{−1} gives the response at position i to a unit impulse at position j.
The solution v = A^{−1}b is the sum over all j of impulses b_j times those responses.

For shift-invariant equations, the response at t to an impulse at T depends only on the
elapsed time t − T. For shift-invariant matrices, the responses (A^{−1})_{ij} depend only
on i − j. The differential equation has constant coefficients. The Toeplitz matrix has
constant diagonals. Here A is a difference matrix and A^{−1} is a sum matrix.


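$$Av = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \qquad v = A^{-1}b = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}. \qquad (8)$$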
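The difference matrix and sum matrix in (8) can be checked in a few lines. This is a small illustrative sketch; the helper names and the right side b = (1, 2, 3) are chosen only for the example. Column j of the sum matrix is the response to the delta vector δ_j, and that response depends only on i − j.

```python
def difference_matrix(n):
    """1 on the diagonal, -1 on the subdiagonal: (Av)_i = v_i - v_{i-1}."""
    return [[1 if i == j else -1 if i == j + 1 else 0 for j in range(n)]
            for i in range(n)]


def sum_matrix(n):
    """Lower triangular matrix of 1's: (Sb)_i = b_1 + ... + b_i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]


A, S = difference_matrix(3), sum_matrix(3)
b = [1, 2, 3]
v = matvec(S, b)          # running sums of b
print(matmul(A, S))       # the identity matrix
print(v, matvec(A, v))    # [1, 3, 6] and then b again
```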
Example 4 (Fourier) Solve the equation −y” + y = f(x) for −∞ < x < ∞.


Solution This is a boundary value problem, with y = 0 at the endpoints x = −∞ and
x = ∞. Take the Fourier transform of every term, so the two derivatives in y”
become multiplications by ik:

$$-y'' + y = f(x) \qquad -(ik)^2\,\hat{y} + \hat{y} = \hat{f}(k) \qquad \hat{y}(k) = \frac{\hat{f}(k)}{k^2 + 1} = \hat{g}(k)\,\hat{f}(k). \qquad (9)$$

In k-space, the transform f̂(k) is multiplied by ĝ(k) = 1/(k² + 1). In x-space, the
right side f(x) is convolved with the Green’s function g(x). That Green’s function g(x)
is the solution when the right side f(x) is a delta function δ(x).

To complete the solution we need g(x). The transform approach would invert
ĝ(k) = 1/(k² + 1). The direct approach is to solve −g” + g = δ(x). Remember that
δ(x) = 0 for x > 0 and x < 0:


x > 0    −g” + g = 0 gives g = c_1 e^{x} + c_2 e^{−x}    Then g(∞) = 0 requires c_1 = 0
x < 0    −g” + g = 0 gives g = C_1 e^{x} + C_2 e^{−x}    Then g(−∞) = 0 requires C_2 = 0


The action is all at x = 0. There is no jump in the function g(x), so that C_1 = c_2.
The minus sign in −g” + g = δ(x) produces a drop of 1 in the slope g’(x) at x = 0.
Comparing the slopes −c_2 e^{−x} and C_1 e^{x} at x = 0 gives C_1 + c_2 = 1. The coefficients are
C_1 = c_2 = 1/2 and the Green’s function g(x) is found:

$$g(x) = \begin{cases} \tfrac{1}{2} e^{-x} & \text{for } x > 0 \\ \tfrac{1}{2} e^{x} & \text{for } x < 0 \end{cases} \qquad \text{and convolution gives} \quad y(x) = \int_{-\infty}^{\infty} f(X)\,g(x - X)\,dX.$$

Compare with this second order equation in time, when Fourier changes to Laplace.
Now we have initial values at t = 0 instead of boundary values at x = ±∞.
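The three facts used to find g(x) in Example 4 can be checked by finite differences. This is a numerical sketch under stated assumptions: the step h = 0.001 and the test points are arbitrary, and the name `green` is a label for this example. The function should satisfy −g” + g = 0 away from x = 0, and the slope should drop by 1 across x = 0.

```python
import math


def green(x):
    """Green's function g(x) = (1/2) e^{-|x|} for -g'' + g = delta(x)."""
    return 0.5 * math.exp(-abs(x))


h = 1e-3


def residual(x):
    """Centered-difference estimate of -g'' + g at a point x away from 0."""
    second = (green(x + h) - 2.0 * green(x) + green(x - h)) / h ** 2
    return -second + green(x)


slope_left = (green(0.0) - green(-h)) / h     # close to +1/2
slope_right = (green(h) - green(0.0)) / h     # close to -1/2
drop_in_slope = slope_left - slope_right      # close to 1

print(residual(1.3), residual(-2.0))          # both near zero
print(drop_in_slope)
```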


Example 5 Solve the equation y” + y = f(t) starting from y(0) = y’(0) = 0.
Solution Take the Laplace transform of both sides, and divide by s² + 1 to find Y(s):

$$s^2 Y(s) + Y(s) = F(s) \quad \text{gives} \quad Y(s) = \frac{F(s)}{s^2 + 1} = F(s)G(s). \qquad (10)$$

The transfer function is G(s) = 1/(s² + 1). That is the Laplace transform of the
impulse response (the growth factor) g(t) = sin t. (Problem 8.5.2 confirms that (sin t)”
does surprisingly produce δ(t). The slope is zero for t < 0, and (sin t)’ jumps to cos 0 = 1
at t = 0.) Multiplication F(s)G(s) corresponds to convolution f * g:

$$\text{Laplace convolution} \qquad y(t) = f(t) * g(t) = \int_0^{t} f(T)\,\sin(t - T)\,dT. \qquad (11)$$


This solves Example 5 quickly—the crucial step is to be able to invert G(s) to find g(t). 
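Equation (11) can be tried on one input where the answer is known. The sketch below is an illustration only: the input f(t) = t is chosen for this example, the trapezoidal rule is an assumed quadrature, and `convolve_from_zero` is a hypothetical helper. For f(t) = t the solution of y” + y = t with y(0) = y’(0) = 0 is y = t − sin t, which can be verified by substitution.

```python
import math


def convolve_from_zero(f, g, t, steps=2000):
    """Trapezoidal estimate of (f * g)(t) = integral_0^t f(T) g(t - T) dT."""
    h = t / steps
    total = 0.5 * (f(0.0) * g(t) + f(t) * g(0.0))
    for k in range(1, steps):
        T = k * h
        total += f(T) * g(t - T)
    return h * total


def ramp_input(T):
    return T                      # driving function f(t) = t


t = 2.0
y_convolution = convolve_from_zero(ramp_input, math.sin, t)
y_exact = t - math.sin(t)         # y'' + y = t, y(0) = y'(0) = 0
print(y_convolution, y_exact)
```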
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Proof of the Convolution Rule 


We need to prove that the Laplace transform of f(t) * g(t) is F(s)G(s). Convolution
becomes multiplication. Similarly the Fourier transform of f(x) * g(x) is f̂(k)ĝ(k).
An integral over T produces f * g, and then an integral over t gives its transform.
The key is to reverse the order in that double integral. Integrate first with respect to t.

$$\int_{t=0}^{\infty} \left[ \int_{T=0}^{\infty} f(T)\,g(t - T)\,dT \right] e^{-st}\,dt = \int_{T=0}^{\infty} f(T) \left[ \int_{t=0}^{\infty} g(t - T)\,e^{-s(t - T)}\,dt \right] e^{-sT}\,dT.$$

It was safe to extend the integration to T = ∞, since g(t − T) = 0 for T > t. Also safe
to insert e^{sT} and e^{−sT}; their product is 1. The inner integral on the right is exactly the
Laplace transform G(s), when t − T is replaced by τ:

$$\int_{t=0}^{\infty} g(t - T)\,e^{-s(t - T)}\,dt = \int_{\tau=-T}^{\infty} g(\tau)\,e^{-s\tau}\,d\tau = \int_{\tau=0}^{\infty} g(\tau)\,e^{-s\tau}\,d\tau = G(s). \qquad (12)$$

Since the inner integral is G(s), the double integral is F(s)G(s) as desired:

$$\int_{T=0}^{\infty} G(s)\,f(T)\,e^{-sT}\,dT = F(s)\,G(s). \qquad \text{The convolution rule is proved.}$$

The same rule holds for Fourier transforms, except the integrals have −∞ < x < ∞
and −∞ < k < ∞. With those limits we don’t have or need the one-sided condition
that g(t) = 0 for t < 0. The steps are the same and we reach the same conclusion.


The Fourier transform of f(x) * g(x) is f̂(k) ĝ(k).
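The Laplace rule can also be tested numerically on one pair of functions. This is a sketch under stated assumptions, not a proof: f = e^{−t} and g = e^{−2t} are example choices with F(s) = 1/(s + 1) and G(s) = 1/(s + 2), the integrals are cut off at t = 20, and both helper names are made up. The transform of the sampled convolution should land close to F(s)G(s).

```python
import math


def convolution_on_grid(f_vals, g_vals, h):
    """Trapezoidal values of (f * g)(t_k) = integral_0^{t_k} f(T) g(t_k - T) dT."""
    conv = [0.0]
    for k in range(1, len(f_vals)):
        total = 0.5 * (f_vals[0] * g_vals[k] + f_vals[k] * g_vals[0])
        for j in range(1, k):
            total += f_vals[j] * g_vals[k - j]
        conv.append(h * total)
    return conv


def laplace_transform(vals, h, s):
    """Trapezoidal estimate of integral_0^{t_max} y(t) e^{-st} dt from grid values."""
    weighted = [v * math.exp(-s * k * h) for k, v in enumerate(vals)]
    return h * (sum(weighted) - 0.5 * (weighted[0] + weighted[-1]))


h, n, s = 0.02, 1000, 1.0                       # grid reaches t_max = 20
grid = [k * h for k in range(n + 1)]
f_vals = [math.exp(-t) for t in grid]           # F(s) = 1/(s + 1)
g_vals = [math.exp(-2.0 * t) for t in grid]     # G(s) = 1/(s + 2)

left = laplace_transform(convolution_on_grid(f_vals, g_vals, h), h, s)
right = (1.0 / (s + 1.0)) * (1.0 / (s + 2.0))   # F(s) G(s) = 1/6 at s = 1
print(left, right)
```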


Point-Spread Functions and Deconvolution 


I must not leave the impression that convolution is only useful in solving differential equa- 
tions. The truth is, we solved those equations earlier. Our solutions now have the neat form 
y = f *g, but they were already found without convolutions. A better application is a 
telescope looking at the night sky, or a CT-scanner looking inside you. 

A telescope produces a blurred image. When the actual star is a point source, we don’t
see that delta function. The image of δ(x, y) is a point-spread function g(x, y): the
response to an impulse, the spreading of a point. With diffraction you see an “Airy disk”
at the center. The radius of this disk gives the limit of resolution for a telescope.


When the star is shifted, the image is shifted. The source δ(x − x_0, y − y_0) produces
the image g(x − x_0, y − y_0). It is bright at the location x_0, y_0 of the star, and g gets dark
quickly away from that point. The image of the whole sky is an integral of blurred points.

The true brightness of the night sky is given by a function f(x, y). The image we
see is the convolution c = f * g. But if we do know the blurring function g(x, y),
deconvolution will bring back f(x, y) from f * g. In transform space, the scanner
multiplies by G and the post-processor divides by G. Here is deconvolution:


c = f * g transforms to C = FG. The inverse transform of F = C/G gives f.


The manufacturer knows the point-spread function g and its Fourier transform G. The
telescope or the CT-scanner comes equipped with a code for deconvolution. Transform the
blurred output c to C, divide by G, and invert F = C/G to find the true source function f.
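A one-dimensional, cyclic version of those three steps fits in a few lines. This is a toy sketch, not the code a real instrument would ship with: the 8-point signal, the blur g, and the slow direct transform `dft` are all assumptions made for illustration. The blur is chosen so that no component of G is zero, since division by G fails otherwise.

```python
import cmath


def dft(x, sign=-1):
    """Direct N-point transform using e^{sign * 2 pi i k n / N} (N^2 work)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]


def idft(X):
    """Inverse of dft: opposite sign in the exponent, then divide by N."""
    N = len(X)
    return [v / N for v in dft(X, sign=+1)]


def cyclic_convolve(f, g):
    N = len(f)
    return [sum(f[j] * g[(n - j) % N] for j in range(N)) for n in range(N)]


def deconvolve(c, g):
    """Transform c and g, divide C by G, transform back."""
    C, G = dft(c), dft(g)
    return [v.real for v in idft([Ck / Gk for Ck, Gk in zip(C, G)])]


f = [0, 0, 5, 0, 0, 1, 0, 0]               # two "stars" (hypothetical data)
g = [0.6, 0.2, 0, 0, 0, 0, 0, 0.2]         # assumed point-spread vector
blurred = cyclic_convolve(f, g)
recovered = deconvolve(blurred, g)
print([round(v, 6) for v in blurred])
print([round(v, 6) for v in recovered])
```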


Note that two-dimensional functions f(x, y) have two-dimensional transforms f̂(k, l).
The Fourier basis functions of x and y are e^{ikx}e^{ily} with two frequencies k and l.


Cyclic Convolution and the DFT 


The Discrete Fourier Transform connects c = (c_0, ..., c_{N−1}) to f = (f_0, ..., f_{N−1}).
The Fourier matrix gives Fc = f. Computations are fast, because all the vectors
are N-dimensional and the FFT is available. A convolution rule will lead directly to
fast multiplication and fast algorithms. This is convolution in practice.

The rule has to change from c * d = (1,1,2) * (2,1,3) = (2,3,8,5,6). When
the inputs c and d have N components, their cyclic convolution also has N components.
The new symbol in (1,1,2) ⊛ (2,1,3) = (7,9,8) indicates “cyclic” by a circle in ⊛.

The key is that w³ = 1. Cyclic convolution folds 5w³ + 6w⁴ back into 5 + 6w.


(1 + 1w + 2w²)(2 + 1w + 3w²) = 2 + 3w + 8w² + 5w³ + 6w⁴ = 7 + 9w + 8w².


In the same way, (0,1,0) ⊛ (0,0,1) = (1,0,0) because w times w² equals w³ = 1.
I will use this example to test the cyclic convolution rule.
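Cyclic convolution rule for the N-point transform


The kth component of F(c ⊛ d) is (Fc)_k times (Fd)_k. That word “times” means:
Multiply 1, w, w² from Fc and 1, w², w⁴ from Fd to get 1, w³, w⁶, which is 1, 1, 1.

$$F = \begin{bmatrix} 1 & 1 & 1 \\ 1 & w & w^2 \\ 1 & w^2 & w^4 \end{bmatrix} \qquad F \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ w \\ w^2 \end{bmatrix} \ \text{times} \ F \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ w^2 \\ w^4 \end{bmatrix} \ \text{is} \ F \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$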
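The cyclic rule can be tested on the two examples above. The sketch below is an illustration with made-up helper names; it multiplies by the Fourier matrix directly (entries w^{jk} with w = e^{2πi/N}) rather than using a fast transform.

```python
import cmath


def cyclic_convolve(c, d):
    """Products c[i] * d[j] are collected in position (i + j) mod N."""
    N = len(c)
    out = [0] * N
    for i in range(N):
        for j in range(N):
            out[(i + j) % N] += c[i] * d[j]
    return out


def fourier_matrix_times(v):
    """Multiply by the Fourier matrix F with entries w^{jk}, w = e^{2 pi i / N}."""
    N = len(v)
    w = cmath.exp(2j * cmath.pi / N)
    return [sum(w ** (j * k) * v[j] for j in range(N)) for k in range(N)]


c, d = [1, 1, 2], [2, 1, 3]
folded = cyclic_convolve(c, d)
left = fourier_matrix_times(folded)                       # F(c ⊛ d)
right = [a * b for a, b in
         zip(fourier_matrix_times(c), fourier_matrix_times(d))]   # (Fc) .* (Fd)
rule_error = max(abs(l - r) for l, r in zip(left, right))

print(folded)                                  # [7, 9, 8]
print(cyclic_convolve([0, 1, 0], [0, 0, 1]))   # [1, 0, 0]
print(rule_error)                              # roundoff only
```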


The convolution c ⊛ d has N² small multiplications. Component by component
multiplication of two vectors only needs N. So the convolution rule gives
a fast way to multiply two very long N-digit numbers (as in the prime factors that
banks use for security). When you multiply the numbers, you are convolving those digits.


Transform the numbers to f and g. Multiply transforms by f_k g_k. Transform back.
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When the cost of these three discrete transforms is included, the FFT saves the day: 


Go to k-space, multiply, go back: N² multiplications are reduced to N + 3N log N.
In MATLAB, component-by-component multiplication is indicated by f .* g (point-star).


F(c ⊛ d) = (Fc) .* (Fd)        ifft(c ⊛ d) = N * ifft(c) .* ifft(d)        (13)

Note that the fft command transforms f to c using w̄ = e^{−2πi/N} and the matrix F̄.
The ifft command inverts that transform using w = e^{2πi/N} and the Fourier matrix F.


The factor N appears in equation (13) because F F̄ = NI.
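Here is one way the three steps (transform, multiply, transform back) could look for integer multiplication. It is a teaching sketch under stated assumptions: the recursive radix-2 `fft` below is a textbook version written for this example, not MATLAB's fft, and rounding the floating-point results is only safe for numbers of modest length. Padding with zeros to a power of 2 that is at least the total digit count keeps the cyclic convolution from wrapping around.

```python
import cmath


def fft(x, sign=-1):
    """Recursive radix-2 transform; len(x) must be a power of 2."""
    N = len(x)
    if N == 1:
        return list(x)
    even = fft(x[0::2], sign)
    odd = fft(x[1::2], sign)
    out = [0j] * N
    for k in range(N // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + twiddle
        out[k + N // 2] = even[k] - twiddle
    return out


def multiply_integers(a, b):
    """Multiply nonnegative integers by convolving their digits in k-space."""
    c = [int(ch) for ch in str(a)][::-1]     # low digit first
    d = [int(ch) for ch in str(b)][::-1]
    N = 1
    while N < len(c) + len(d):               # room for every product term
        N *= 2
    C = fft(c + [0] * (N - len(c)))
    D = fft(d + [0] * (N - len(d)))
    back = fft([Ck * Dk for Ck, Dk in zip(C, D)], sign=+1)
    coeffs = [round(v.real / N) for v in back]            # convolved digits
    return sum(digit * 10 ** p for p, digit in enumerate(coeffs))  # carrying


print(multiply_integers(112, 213))           # 23856
```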


Circulant Matrices 


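Multiplication by an infinite constant-diagonal matrix gives an infinite convolution. When
row n of C_∞ multiplies d, this adds up the small multiplications c_i d_j with i + j = n:

$$\text{Infinite convolution} \qquad C_\infty d = \begin{bmatrix} c_0 & c_{-1} & c_{-2} & \cdot \\ c_1 & c_0 & c_{-1} & c_{-2} \\ c_2 & c_1 & c_0 & c_{-1} \\ \cdot & c_2 & c_1 & c_0 \end{bmatrix} \begin{bmatrix} d_0 \\ d_1 \\ d_2 \\ \cdot \end{bmatrix} = c * d. \qquad (14)$$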


Similarly, cyclic convolution comes from an N by N matrix. The matrix is called a
“circulant” because every diagonal wraps around (based on w^N = 1). All diagonals have
N equal entries. The diagonal with c_1 is highlighted for N = 4:

$$\text{Cyclic convolution, circulant matrix} \qquad Cd = \begin{bmatrix} c_0 & c_3 & c_2 & \mathbf{c_1} \\ \mathbf{c_1} & c_0 & c_3 & c_2 \\ c_2 & \mathbf{c_1} & c_0 & c_3 \\ c_3 & c_2 & \mathbf{c_1} & c_0 \end{bmatrix} \begin{bmatrix} d_0 \\ d_1 \\ d_2 \\ d_3 \end{bmatrix} = c \circledast d. \qquad (15)$$

Notice how the top row produces c_0 d_0 + c_3 d_1 + c_2 d_2 + c_1 d_3. Those subscripts 0 + 0
and 3 + 1 and 2 + 2 are all zero when N = 4. In this cyclic world, 2 and 2 add to 0.
That comes from w²w² = w⁴ = w⁰.

Circulant matrices are remarkable. If you multiply circulants B and C you get another
circulant. That product BC gives convolution with the vector b ⊛ c. The amazing
part is the eigenvalues from the DFT and eigenvectors from the Fourier matrix:


The eigenvalues of C are the components of the discrete transform Fc.
The eigenvectors of every C are the columns of F (also the columns of F̄ and F^{−1}).


We can verify two eigenvalues λ = c_0 + c_1 + c_2 and c_0 + c_1 w + c_2 w² for this circulant:

$$\begin{bmatrix} c_0 & c_2 & c_1 \\ c_1 & c_0 & c_2 \\ c_2 & c_1 & c_0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \lambda \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \qquad \begin{bmatrix} c_0 & c_2 & c_1 \\ c_1 & c_0 & c_2 \\ c_2 & c_1 & c_0 \end{bmatrix} \begin{bmatrix} 1 \\ w^2 \\ w \end{bmatrix} = \lambda \begin{bmatrix} 1 \\ w^2 \\ w \end{bmatrix}. \qquad (16)$$

The equation FC = ΛF is the cyclic convolution rule F(c ⊛ d) = (Fc) .* (Fd).
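Equation (16) can be verified for all three eigenvalues at once. The sketch below is illustrative: the vector c = (1,2,3) and the helper names are chosen for this example. Each eigenvector is taken as a column of F̄, with entries w^{−jk}, and the matching eigenvalue is the kth component of Fc.

```python
import cmath


def circulant(c):
    """N by N matrix with entry c[(i - j) mod N]: every diagonal wraps around."""
    N = len(c)
    return [[c[(i - j) % N] for j in range(N)] for i in range(N)]


def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]


c = [1, 2, 3]
N = len(c)
w = cmath.exp(2j * cmath.pi / N)
C = circulant(c)

errors = []
for k in range(N):
    eigenvector = [w ** (-j * k) for j in range(N)]            # column k of conj(F)
    eigenvalue = sum(c[m] * w ** (m * k) for m in range(N))    # component k of F c
    Cv = matvec(C, eigenvector)
    errors.append(max(abs(Cv[i] - eigenvalue * eigenvector[i]) for i in range(N)))

print(C)             # [[1, 3, 2], [2, 1, 3], [3, 2, 1]]
print(max(errors))   # roundoff only
```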
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The End of the Book 


The book is ending on a high note. Constant coefficient problems have taken a big step from 
Ay” + By' + Cy = 0. Now we have transforms (Fourier and Laplace) and convolutions. 
The discrete problems bring constant diagonal matrices. Cyclic problems bring circulants. 
Time to stop ! 

I should really say, stop and look back. The book has emphasized linear problems, be- 
cause these are the equations we can understand. It is true that life is not linear. If the input is 
multiplied by 10, the output might be multiplied by 8 or 12 and not 10. 
But in most real problems, the input is multiplied or divided by less than 1.1. 
Then a linear model replaces a curve by its tangent lines (this is the key to calculus). 
To understand applied mathematics, we need differential equations and linear algebra. 


REVIEW OF THE KEY IDEAS


1. Convolution (1,2,3) * (4,5,6) is the multiplication 123 x 456 without carrying. 


2. (Σ c_k e^{ikx})(Σ d_k e^{ikx}) has (c * d)_n = Σ c_k d_{n−k} as the coefficient of e^{inx}.
Multiply functions ⟷ convolve coefficients as in (1 + 2x + 3x²)(4 + 5x + 6x²).


3. Differential equations transform to Y(s) = F(s)G(s). Then y(t) = f(t) * g(t) = 
driving force * impulse response. The impulse response g(t) is the Green’s function. 


4. Shift invariance : Constant coefficient equations and constant diagonal matrices. 


5. Circulants Cd give cyclic convolution c ⊛ d. Multiply components (Fc) .* (Fd).


Problem Set 8.6 


1 Find the convolution v * w and also the cyclic convolution v ⊛ w:
(a) v = (1, 2) and w = (2, 1)    (b) v = (1, 2, 3) and w = (4, 5, 6).


2 Compute the convolution (1, 3, 1) * (2, 2, 3) = (a, b, c, d, e). To check your answer,
add a + b + c + d + e. That total should be 35 since 1 + 3 + 1 = 5 and 2 + 2 + 3 = 7
and 5 × 7 = 35.


3 Multiply 1 + 3x + x² times 2 + 2x + 3x² to find a + bx + cx² + dx³ + ex⁴.
Your multiplication was the same as the convolution (1, 3, 1) * (2, 2, 3) in Problem 2.
When x = 1, your multiplication shows why 1 + 3 + 1 = 5 times 2 + 2 + 3 = 7
agrees with a + b + c + d + e = 35.


4 (Deconvolution) Which vector v would you convolve with w = (1, 2, 3) to get
v * w = (0, 1, 2, 3, 0)? Which v gives v ⊛ w = (3, 1, 2)?
5 (a) For the periodic functions f(x) = 4 and g(x) = 2 cos x, show that f * g is
zero (the zero function)!

(b) In frequency space (k-space) you are multiplying the Fourier coefficients of
4 and 2 cos x. Those coefficients are c_0 = 4 and d_1 = d_{−1} = 1.
Therefore every product c_k d_k is ____.


6 For periodic functions f = Σ c_k e^{ikx} and g = Σ d_k e^{ikx}, the Fourier coefficients of
f * g are 2π c_k d_k. Test this factor 2π when f(x) = 1 and g(x) = 1 by computing
f * g from its definition (4).


7 Show by integration that the periodic convolution $\int_0^{2\pi} \cos x \cos(t - x)\,dx$ is π cos t.
In k-space you are squaring Fourier coefficients c_1 = c_{−1} = 1/2 to get 1/4 and 1/4;
these are the coefficients of (1/2) cos t. The 2π in Problem 6 makes π cos t correct.


8 Explain why f * g is the same as g * f (periodic or infinite convolution).


9 What 3 by 3 circulant matrix C produces cyclic convolution with the vector
c = (1, 2, 3)? Then Cd equals c ⊛ d for every vector d. Compute c ⊛ d for
d = (…).


10 What 2 by 2 circulant matrix C produces cyclic convolution with c = (1, 1)?
Show in four ways that this C is not invertible. Deconvolution is impossible.

(1) Find the determinant of C.    (2) Find the eigenvalues of C.
(3) Find d so that Cd = c ⊛ d is zero.    (4) Fc has a zero component.


11 (a) Change b(x) * δ(x − 1) to a multiplication b̂ d̂. Transform the box function
b(x) = {1 for 0 < x < 1} to $\hat{b}(k) = \int_0^1 e^{-ikx}\,dx$. The shifted delta transforms to
$\hat{d}(k) = \int \delta(x - 1)\,e^{-ikx}\,dx$.

(b) Show that your result b̂ d̂ is the transform of a shifted box function. Then
convolution with δ(x − 1) shifts the box.


12 Take the Laplace transform of these equations to find the transfer function G(s):
(a) Ay” + By’ + Cy = δ(t)    (b) y’ − 5y = δ(t)    (c) 2y(t) − y(t − 1) = δ(t)


13 Take the Laplace transform of y’’’’ = δ(t) to find Y(s). From the Transform Table
in Section 8.5 find y(t). You will see y’’’ = 1 and y’’’’ = 0. But y(t) = 0 for
negative t, so your y’’’ is actually a unit step function and your y’’’’ is actually δ(t).


14 Solve these equations by Laplace transform to find Y(s). Invert that transform
with the Table in Section 8.5 to recognize y(t).

(a) y’ − 6y = e^(…), y(0) = 2    (b) y” + 9y = 1, y(0) = y’(0) = 0.
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22 


23 


Find the Laplace transform of the shifted step H(t − 3) that jumps from 0 to 1 at t = 3.
Solve y’ − ay = H(t − 3) with y(0) = 0 by finding the Laplace transform Y(s) and
then its inverse transform y(t): one part for t < 3, second part for t > 3.


Solve y’ = 1 with y(0) = 4, a trivial question. Then solve this problem the slow
way by finding Y(s) and inverting that transform.


The solution y(t) is the convolution of the input f(t) with what function g(t)?
(a) y’ − ay = f(t) with y(0) = 3    (b) y’ − (integral of y) = f(t).


For y’ − ay = f(t) with y(0) = 3, we could replace that initial value by adding
3δ(t) to the forcing function f(t). Explain that sentence.


What is δ(t) * δ(t)? What is δ(t − 1) * δ(t − 2)? What is δ(t − 1) times δ(t − 2)?

By Laplace transform, solve y’ = y with y(0) = 1 to find a very familiar y(t).


By Fourier transform as in (9), solve −y” + y = box function b(x) on 0 < x < 1.


There is a big difference in the solutions to y” + By’ + Cy = f(x), between the
cases B² < 4C and B² > 4C. Solve y” + y = δ and y” − y = δ with y(±∞) = 0.


(Review) Why do the constant f(t) = 1 and the unit step H(t) have the same
Laplace transform 1/s? Answer: Because the transform does not notice ____.


MATRIX FACTORIZATIONS 


1. A = LU = (lower triangular L with 1’s on the diagonal) (upper triangular U with pivots on the diagonal).
Requirements: No row exchanges as Gaussian elimination reduces A to U.

2. A = LDU = (lower triangular L with 1’s on the diagonal) (pivot matrix D is diagonal) (upper triangular U with 1’s on the diagonal).
Requirements: No row exchanges. The pivots in D are divided out to leave 1’s on the
diagonal of U. If A is symmetric then U is L^T and A = LDL^T.

3. PA = LU (permutation matrix P to avoid zeros in the pivot positions).
Requirements: A is invertible. Then P, L, U are invertible. P does all of the
row exchanges in advance, to allow normal LU. Alternative: A = L_1 P_1 U_1.

4. EA = R (m by m invertible E) (any matrix A) = rref(A).
Requirements: None! The reduced row echelon form R has r pivot rows and pivot
columns. The only nonzero in a pivot column is the unit pivot. The last m − r rows
of E are a basis for the left nullspace of A; they multiply A to give zero rows in R.
The first r columns of E^{−1} are a basis for the column space of A.

5. S = C^T C = (lower triangular) (upper triangular) with √D on both diagonals.
Requirements: S is symmetric and positive definite (all n pivots in D are positive).
This Cholesky factorization C = chol(S) has C^T = L√D, so C^T C = LDL^T.

6. A = QR = (orthonormal columns in Q) (upper triangular R).
Requirements: A has independent columns. Those are orthogonalized in Q by the
Gram-Schmidt or Householder process. If A is square then Q^{−1} = Q^T.

7. A = VΛV^{−1} = (eigenvectors in V) (eigenvalues in Λ) (left eigenvectors in V^{−1}).
Requirements: A must have n linearly independent eigenvectors.

8. S = QΛQ^T = (orthogonal matrix Q) (real eigenvalue matrix Λ) (Q^T is Q^{−1}).
Requirements: S is real and symmetric. This is the Spectral Theorem.
9. A = MJM^{−1} = (generalized eigenvectors in M) (Jordan blocks in J) (M^{−1}).
Requirements: A is any square matrix. This Jordan form J has a block for each
independent eigenvector of A. Every block has only one eigenvalue.

10. A = UΣV^T = (orthogonal U is m × m) (m × n singular value matrix Σ with σ_1, ..., σ_r on its diagonal) (orthogonal V is n × n).
Requirements: None. This singular value decomposition (SVD) has the eigenvectors
of AA^T in U and eigenvectors of A^T A in V; σ_i = √λ_i(A^T A) = √λ_i(AA^T).

11. A^+ = VΣ^+U^T = (orthogonal n × n) (n × m pseudoinverse of Σ with 1/σ_1, ..., 1/σ_r on its diagonal) (orthogonal m × m).
Requirements: None. The pseudoinverse A^+ has A^+A = projection onto row space
of A and AA^+ = projection onto column space. The shortest least-squares solution
to Ax = b is x̂ = A^+b. This solves A^T A x̂ = A^T b. When A is invertible: A^+ = A^{−1}.

12. A = QH = (orthogonal matrix Q) (symmetric positive definite matrix H).
Requirements: A is invertible. This polar decomposition has H² = A^T A. The
factor H is semidefinite if A is singular. The reverse polar decomposition A = KQ
has K² = AA^T. Both have Q = UV^T from the SVD.

13. A = UΛU^{−1} = (unitary U) (eigenvalue matrix Λ) (U^{−1} which is U^H = Ū^T).
Requirements: A is normal: A^H A = AA^H. Its orthonormal (and possibly complex)
eigenvectors are the columns of U. Complex λ’s unless A = A^H: Hermitian case.

14. A = UTU^{−1} = (unitary U) (triangular T with λ’s on diagonal) (U^{−1} = U^H).
Requirements: Schur triangularization of any square A. There is a matrix U with
orthonormal columns that makes U^{−1}AU triangular.

15. $F_n = \begin{bmatrix} I & D \\ I & -D \end{bmatrix} \begin{bmatrix} F_{n/2} & 0 \\ 0 & F_{n/2} \end{bmatrix} \begin{bmatrix} \text{even-odd} \\ \text{permutation} \end{bmatrix}$ = one step of the (recursive) FFT.
Requirements: F_n = Fourier matrix with entries w^{jk} where w^n = 1: F_n F̄_n = nI.
D has 1, w, ..., w^{n/2−1} on its diagonal. For n = 2^ℓ the Fast Fourier Transform
will compute F_n x with only ½nℓ = ½ n log₂ n multiplications from ℓ stages of D’s.


Properties of Determinants 


1 The determinant of the n by n identity matrix is 1.

2 The determinant changes sign when two rows are exchanged (sign reversal):

$$\begin{vmatrix} c & d \\ a & b \end{vmatrix} = -\begin{vmatrix} a & b \\ c & d \end{vmatrix}$$

3 The determinant is a linear function of each row separately (all other rows stay fixed).

$$\text{multiply row 1 by any number } t \qquad \begin{vmatrix} ta & tb \\ c & d \end{vmatrix} = t \begin{vmatrix} a & b \\ c & d \end{vmatrix}$$

$$\text{add row 1 of } A \text{ to row 1 of } A' \qquad \begin{vmatrix} a + a' & b + b' \\ c & d \end{vmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} + \begin{vmatrix} a' & b' \\ c & d \end{vmatrix}$$

Pay special attention to rules 1-3. They completely determine the number det A.

4 If two rows of A are equal, then det A = 0.

5 Subtracting a multiple of one row from another row leaves det A unchanged.

$$\ell \text{ times row 1 from row 2} \qquad \begin{vmatrix} a & b \\ c - \ell a & d - \ell b \end{vmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix}$$

6 A matrix with a row of zeros has det A = 0.

7 If A is triangular then det A = a_11 a_22 ··· a_nn = product of diagonal entries.

8 If A is singular then det A = 0. If A is invertible then det A ≠ 0.

Proof Elimination goes from A to U. If A is singular then U has a zero row. The rules give
det A = det U = 0. If A is invertible then U has the pivots along its diagonal. The product
of nonzero pivots (using rule 7) gives a nonzero determinant:

Multiply pivots    det A = ± det U = ± (product of the pivots).

9 The determinant of AB is det A times det B: |AB| = |A| |B|.
A times A^{−1}:    AA^{−1} = I so (det A)(det A^{−1}) = det I = 1.

10 The transpose A^T has the same determinant as A.
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Index 


A 


absolute stability, 189 

absolute value, 83, 86 
acceleration, 73, 478 

accuracy, 184, 185, 190, 191 
Adams method, 192, 193 

add exponents, 9 

addition formula, 87 

adjacency matrix, 318, 320, 427 
Airy’s equation, 130 

albedo, 49 

amplitude, 75, 82, 111 

amplitude response, 34, 77 
antisymmetric, 245, 323, 352, 409 
applied mathematics, 316, 423, 487 
arrows, 156, 318 

associative law, 220 

attractor, 170, 181 

augmented matrix, 231, 259, 273, 280 
autocorrelation, 480 

autonomous, 57, 71, 157, 158, 160 
average, 436, 440 


B 


back substitution, 213, 264 
backslash, 221 

backward difference, 6, 12, 246, 415 
backward Euler, 188, 189 

bad news, 329 

balance equation, 48, 118, 316, 424 
balance of forces, 118 

bank, 12, 40, 485 

bar, 406, 408, 412, 455, 457 


basis, 285, 289, 291, 296, 338, 446, 447 


beam, 469 


beat, 128 

bell-shaped curve, 16, 190, 458 
Bernoulli equation, 61 

Bessel function, 367, 460, 478 

better notation, 113, 124, 125 

big picture, 300, 303, 306, 400 
Black-Scholes, 457 

block matrix, 231, 237, 420 

block multiplication, 226, 227 

boundary conditions, 406, 411, 431, 457 
boundary value problem, 406, 457, 470 
box, 176 

box function, 407, 439, 445, 469, 478, 488 
Brauer, 180 


Cc 


capacitance, 119 

carbon, 46 

carrying capacity, 53, 55, 61 
Castillo-Chavez, 180 

catalyst, 180 

Cayley-Hamilton theorem, 348 

cell phone, 44, 176 

center, 161, 163, 174 

centered difference, 6, 190 

chain rule, 3, 4, 368, 371 

change of variables, 365 

chaos, 155, 181 

characteristic equation, 90, 103, 108, 164 
chebfun, 405 

chemical engineering, 457 

chess matrix, 311 

Cholesky factorization, 403 
circulant matrix, 205, 449, 486, 488 
circular motion, 76, 351 
closed-loop, 64 
closest line, 387, 393 

coefficient matrix, 199 

cofactor, 331 

column picture, 198, 206 

column rank, 275, 322 

column space, 254, 259, 278 

column-times-row, 222, 226, 429 

combination of columns, 199, 202 

combination of eigenvectors, 329, 349, 
356, 371, 374 

commute, 221, 224 

companion matrix, 164, 165, 167, 335, 
354-356, 360, 369 

competition, 53, 174 

complete graph, 427, 428 

complete solution, 1, 17, 18, 105, 106, 
203, 211, 265, 274, 276 

complex conjugate, 32, 87, 94, 379 

complex eigenvalues, 166 

complex exponential, 13, 432 

complex Fourier series, 440 

complex gain, 111 

complex impedance, 120 

complex matrix, 376 

complex numbers, 31-33, 82-89 

complex roots, 90, 163 

complex solution, 36, 38, 39, 89 

complex vector, 433 

compound interest, 12, 185 

computational mechanics, 372 

computational science, 419, 447 

concentration, 47, 180 

condition number, 401 

conductance matrix, 124, 385, 425, 426 

conjugate transpose, 377 

constant coefficients, 1, 98, 117, 432, 
470, 487 

constant diagonals, 482, 486, 487 

constant source, 20 

continuous, 154, 358 

continuous interest, 44 

convergence, 10, 196 

convex, 73 

convolution, 117, 136, 479-489 




Convolution Rule, 476, 480, 484, 485 
Cooley-Tukey, 451 

cooling (Newton’s Law), 46 
cosine series, 436 

Counting Theorem, 267, 304, 314 
Cramer’s Rule, 331 

critical damping, 96, 100, 115 
critical point, 170, 171, 182 
cubic spline, 139 

Current Law, 123, 317, 318 

cyclic convolution, 485-487 


D 


d’Alembert, 464, 467 

damped frequency, 99, 105, 113 

damped gain, 113 

damping, 96, 112, 118, 122 

damping ratio, 99, 113, 114 

dashpot, 118 

data, 401, 431 

decay rate, 46, 437, 444, 456, 467 

deconvolution, 485, 487 

degree matrix, 318, 427, 429 

delta function, 23, 28, 78, 97, 98, 407, 
438, 439, 442, 458, 471 

delta vector, 415, 447, 482 

dependent, 288 

dependent columns, 209 

derivative rule, 141, 441, 476 

determinant, 175, 228, 232, 326, 330, 
332, 336, 347, 353, 402, 492 

DFT, 432, 446, 449, 454, 485 

diagonal matrix, 229, 398 

diagonalizable, 363, 382 

difference equation, 45, 52, 184, 188, 338 

difference matrix, 240, 314, 405, 423 

differential equation, 1, 40, 349 

diffusion, 358, 456, 457 

diagonalization, 337, 400 

dimension, 44, 52, 267, 285, 291-293, 
304, 322 

dimensionless, 34, 99, 113, 124 

direction field, 157 

Discrete Cosine Transform (DCT), 454 

Discrete Fourier Transform, (see DFT) 




discrete sines, 405, 432, 454 
displacements, 124 
distributive law, 220 
divergence, 417 

dot product, 201, 214, 248, 377 
double angle, 84 

double pole, 145, 472 

double root, 91, 92, 101 
doublet, 151 

doubling time, 46, 47 

driving function, 77, 112, 476 
dropoff curve, 57, 62, 157 


E 


echelon matrix, 263, 266, 267 

edge, 313, 423 

eigenfunction, 408, 421, 455, 459, 467 
eigenvalue, 164, 325, 326, 382 
eigenvalue matrix, 337 

eigenvector, 167, 325, 326, 382 
eigenvector matrix, 337, 363 
Einstein, 464 

elapsed time, 98 

elimination, 210, 212, 334 
elimination matrix, 224, 229, 303 
empty set, 293 

energy, 396, 397, 409, 411, 424, 443 
energy balance, 48 

energy identity, 440, 444 

enzyme, 180 

epidemic, 179, 180 

equal roots, 90, 92, 100 
equilibrium, 417 

error, 185, 186, 191, 193 

error function, 458 

error vector, 386, 394 

Euler, 317 

Euler equations, 176, 183 

Euler’s Formula, 13, 82, 83, 450 
Euler’s method, 185, 186, 189, 384 
even permutation, 246 

exact equations, 65 

existence, 154, 196 

exponential, 2, 7, 10, 25, 131, 362, 369 
exponential response, 104, 108, 117 
F 


factorization, 382, 490 

farad, 122 

Fast Fourier Transform, (see FFT) 

feedback, 64 

FFT, 88, 432, 446, 447, 450, 451 

fftw, 452 

Fibonacci, 340, 345, 405 

filter, 480 

finite elements, 124, 373, 419, 430 

finite speed, 463 

first order, 164 

flow graph, 452 

football, 176, 178 

force balance, 426 

forced oscillation, 80, 105, 110 

forward differences, 240 

Four Fundamental Subspaces, 300, 303 

Fourier coefficients, 435-437, 440 

Fourier cosine series, 457 

Fourier Integral Transform, 449 

Fourier matrix, 85, 243, 446-448, 450 

Fourier series, 419, 436, 439, 443, 455 

Fourier sine series, 410, 434, 467 

fourth order, 80, 93, 469 

foxes, 172, 174 

free column, 262 

free variable, 262, 266, 269, 270, 274 

free-free boundary conditions, 412 

frequency, 31, 76, 79, 373, 466 

frequency domain, 120, 145, 449, 480 

frequency response, 36, 77, 432 

frisbee, 176 

full rank, 275-277, 281, 287, 385 

function space, 293, 298, 433, 440, 480 

fundamental matrix, 366, 371, 384 

fundamental solution, 78, 81, 97, 117, 458 

Fundamental Theorem, 5, 8, 42, 244, 
304, 307, 400 


G 


gain, 30, 33, 84, 104, 111 
Gauss-Jordan, 230-232, 236, 283, 331 
gene, 431 
general solution, 280 

generalized eigenvalues, 372 
geometric series, 7 

Gibbs phenomenon, 435, 436 

gold, 153 

Gompertz equation, 63 

Google, 328 

GPS, 464 

gradient, 417, 421 

graph, 313, 317, 318, 320, 416, 423 
graph Laplacian, 316, 318, 423 
Green’s function, 136, 482, 483 
greenhouse effect, 49 

grid, 416, 419, 429 

ground a node, 424, 426 

growth factor, 24, 40-42, 51, 97, 135, 482 
growth rate, 2, 40, 364 


H 


Hénon map, 181 

Hadamard matrix, 243, 344 
half-life, 46 

harmonic motion, 75, 76, 79 
harvesting, 59, 60, 62 

hat function, 467 

heat equation, 410, 455, 456 
heat kernel, 457, 458, 460 
Heaviside, 21, 477 

Henry, 122 

Hermitian matrix, 377 
Hertz, 76 

higher order, 93, 102, 105, 107, 117, 355 
Hilbert space, 433 
homogeneous, 17, 103 
Hooke’s Law, 74, 374, 424 
hyperplane, 207 


I 


identity matrix, 201, 219 
image, 484 

imaginary eigenvalues, 331, 351 
impedance, 39, 120, 121, 127 
implicit, 67, 188 

impulse, 23, 78 
impulse response, 23, 24, 78, 97, 102, 
117, 121, 136, 140, 150, 482 

incidence matrix, 124, 313, 317, 320, 423 

independence, 204 

independent columns, 273, 276, 290, 
322, 385, 391 

independent eigenvectors, 362 

independent rows, 273 

inductance, 119 

infection rate, 179 

infinite series, 10, 13, 329, 369, 434, 455 

inflection point, 54, 55 

initial conditions, 2, 40, 73, 349, 457 

initial values, 470, 483 

inner product, 226, 323, 377, 409, 433 

instability, 193 

integrating factor, 19, 26, 41, 482 

integration by parts, 248, 323, 409, 413, 431 

interest rate, 12, 43, 485 

intersection, 201, 258, 299 

inverse matrix, 31, 228, 231, 482 

inverse transform, 140, 446, 473, 477 

invertible, 205, 213, 228, 290 

isocline, 156, 159, 160 


J 


Jacobian matrix, 171, 177 
Jordan form, 357, 382, 383 
Julia, 330 

jump, 21, 474, 475 


K 


key formula, 8, 19, 78, 112, 117, 135, 482 
kinetic energy, 79 

Kirchhoff’s Current Law, 316, 424 
Kirchhoff’s Laws, 123, 272 

Kirchhoff’s Voltage Law, 315 

KKT matrix, 428 

kron(A, B), 420 


L 

l’Hôpital’s Rule, 43, 109 
LAPACK, 242, 332 

Laplace convolution, 481, 483 
Laplace equation, 416, 417 




Laplace transform, 121, 141-151, 470-478 
Laplace’s equation, 418, 442, 443 
Laplacian matrix, 318, 320, 424 

law of mass action, 180 

least squares, 385-387 

left eigenvectors, 348 

left nullspace, 300, 302 

left-inverse, 228, 232, 242 

length, 242 

Liénard, 182 

linear combination, 199, 201, 254, 288 
linear equation, 4, 17, 105, 134, 177, 349 
linear shift-invariant, 459 

linear time-invariant (LTI), 71, 349 
linear transformation, 209 

linearity, 221, 471 

linearization, 172-179 

linearly independent, 277, 287, 289 
lobster trap, 159 

logistic equation, 47, 53, 62, 157, 190 
loop, 315-317 

loop equation, 119, 123, 127 

Lorenz equation, ix, 154, 181 
Lotka-Volterra, 173 


M 


magic matrix, 209 

magnitude, 112 

magnitude response, 34, 77 

Markov matrix, 327, 329, 333, 382 
mass action, 180 

mass matrix, 372, 381 

Mathematica, 194, 467 

mathematical finance, 457 

MATLAB, 191, 332, 372, 447, 451, 486 


The single heading “Matrix” indexes 
the active life of linear algebra. 


Matrix 


−1, 2, −1, 246, 415, 454 
adjacency, 318 
antisymmetric, 352, 376 
augmented, 230, 271, 278 
circulant, 486, 488 
companion, 164, 355, 360 

complex, 376 

difference, 240, 314, 405, 422, 

echelon, 266 

eigenvalue, 337 

eigenvector, 337, 363 

elimination, 224, 229, 303 

exponential, 14, 362, 368 

factorizations, 382, 490 

Fourier, 85, 243, 446, 447, 450 

fundamental, 366 

Hadamard, 243, 344 

Hermitian, 377 

identity, 201, 219 

incidence, 124, 313, 314, 317, 423 

inverse, 228, 231 

invertible, 204, 213, 231, 290 

Jacobian, 171, 177 

KKT, 428 

Laplacian, 318, 320, 424 

Markov, 327, 333 

orthogonal, 238, 247, 376 

permutation, 241, 246, 299, 450 

positive definite, 372, 385, 396 

projection, 238, 242, 247, 334, 376, 
378, 382, 390, 394 

rank one, 305, 382, 404 

rectangular, 385 

reflection, 247 

rotation, 331 

saddle-point, 428, 430 

second difference, 414 

semidefinite, 398, 412, 413 

similar, 365, 370, 383 

singular, 202, 326, 328, 492 

skew-symmetric, 382 

sparse, 223 

stable, 352 

stiffness, 124, 372, 385 

symmetric, 238, 375, 409 

Toeplitz, 480, 482 

tridiagonal, 382, 454 

unitary, 377 
matrix multiplication, 219-223, 249 
mean, 392, 395 

mechanics, 74 

mesh, 420 

Michaelis-Menten, 180 
minimum, 404 

model problem, 40, 115, 374, 423 
modulus, 32, 83 

multiplication, 202, 219, 479 
multiplicity, 93, 343 

multiplier, 210, 214, 225 
multistep method, 192 


N 


natural frequency, 77, 99, 102, 466 

network, 313-323, 416, 425, 426 

neutral stability, 166, 339, 352 

Newton’s Law, 46, 73, 239, 370 

Newton’s method, 6, 181 

nodal analysis, 123 

node, 313, 423 

nondiagonalizable, 339, 342, 346, 383 

nonlinear equation, 1, 53, 172 

nonlinear oscillation, 71 

norm, 400, 401 

normal distribution, 458 

normal equations, 387, 389 

normal modes, 373 

Nth order equation, 107, 117 

null solution, 17, 18, 78, 92, 103, 106, 
113, 203 

nullity, 267 

nullspace, 261 

number of solutions, 282 


O 


ODE45, 191, 193 

off-diagonal ratios, 227 

Ohm’s Law, 39, 122, 424, 425, 427 
one-way wave, 463, 468 

open-loop, 64 

operation count, 452 

optimal control, 478 

order of accuracy, 186, 190, 192 
orthogonal basis, 399, 433, 447, 448 


orthogonal eigenvectors, 239, 375 
orthogonal functions, 323, 405, 434 
orthogonal matrix, 238, 242, 376, 381 
orthogonal subspace, 306 
orthonormal basis, 398, 400, 440 
orthonormal columns, 242, 397 
oscillation, 74, 75 

oscillation equation, 372 
overdamping, 96, 100, 102 
overshoot (Gibbs), 435, 436 


P 


PF2, 62, 142, 149, 472 

PF3, 143, 149, 472 

parabolas, 91, 96 

parallel, 122, 127 

partial differential equation, (see PDE) 
partial fractions, 56, 62, 142-149, 474 
partial sums, 438 




particular solution, 17, 18, 41, 106, 203, 
274, 276, 278 
PDE, 416, 455, 466 
peak time, 113, 128 
pendulum, 71, 81, 182 
period, 76, 163, 444 
periodic, 173 
permutation matrix, 241, 246, 299, 450 
perpendicular, 201, 243, 389, 433, 434 
perpendicular eigenvectors, 383 
perpendicular subspaces, 312 
phase angle, 32, 80 
phase lag, 30, 33, 75, 81, 112 
phase line, 170 
phase plane, 59, 351 
phase response, 77 
pictures, 153, 162 
pivot, 210, 212, 225, 233, 402 
pivot column, 262, 264, 290, 294 
pivot variable, 264, 270 
plane, 201, 207, 258 
Pluto, 155 
point source, 23, 457, 458 
point-spread function, 484 
Poisson’s equation, 417 
polar angle, 38, 83 
polar form, 30, 32, 84, 110, 112, 121, 
244, 418, 431, 448 

poles, 100, 129, 140, 471-473 

polynomial, 131 

Pontryagin, 478 

population, 47, 55, 61, 63 

positive definite, 372, 385, 396, 403-411 

positive definite matrix, 372, 382, 396 

positive semidefinite, 412, 413 

potential energy, 79 

powers, 221, 328, 341 

practical resonance, 126 

predator-prey, 172, 174, 180 

prediction-correction, 191 

present value, 51 

principal axis, 376 

Principal Component Analysis, 401, 431 

probability, 458 

product integral, 384 

product of pivots, 330, 492 

product rule, 8 

projection, 387, 389-391, 394 

projection matrix, 247, 334, 382, 389, 394 

pulse, 392, 393 

Python, 330 


Q 


quadratic formula, 90 
quiver, 155 


R 


rabbits, 172, 174 

radians, 76 

radioactive decay, 45 

ramp function, 23, 98, 407, 408, 477 
ramp response, 129 

rank, 267, 273, 277, 301 

rank of AB, 311 

rank one matrix, 305, 382, 401 
rank theorem, 322 

Rayleigh quotient, 431 
reactance, 121 

real eigenvalues, 166, 239, 375 
real roots, 90, 162 

real solution, 31, 111 
rectangular form, 110, 111 

rectangular matrix, 385 

recursion, 452, 453 

red lights, 478 

reflection matrix, 247, 382 

relativity, 464 

relaxation time, 46 

repeated eigenvalues, 338, 339, 355, 383 

repeated roots, 90, 92, 101, 355 

repeating ramp, 436 

resistance, 119, 426 

resonance, 26, 27, 29, 79, 82, 108, 109, 
114, 116, 132, 137, 364 

response, 77 

reverse order, 229, 238, 248 

right triangle, 129, 386 

right-inverse, 228, 232, 233 

RLC loop, 39, 118, 119, 122 

roots, 101, 108, 129 

roots of $z^N = 1$, 448 

rotation matrix, 331 

row exchange, 212, 216, 242 

row picture, 198, 199, 214 

row space, 289, 323 

rref(A), 263, 265, 267, 268, 284 

Runge-Kutta, 16, 191-193 


S 

S-curve, 54, 64, 157 

saddle, 162, 169, 173, 177, 402, 428 

saddle-point matrix, 428, 430 

SciPy, 194 

second difference, 240, 246, 410, 414, 415 

semidefinite, 398, 412 

separable, 56, 65 

separation of variables, 421, 422, 456, 
459, 460, 466 

shift, 441 

shift invariance, 98, 459, 480, 482, 487 

shift rule for transform, 475 

sign reversal, 492 

similar matrix, 365, 370, 383 

Simpson’s Rule, 195 

sines and cosines, 439 

singular matrix, 202, 205, 218, 326, 492 
singular value, 398, 400, 405 

Singular Value Decomposition, (see SVD) 
singular vector, 385 

sink, 17, 162 

sinusoid, 19, 30, 34 

sinusoidal identity, 35, 37, 112 

SIR model, 179 

six pictures, 162, 171 

skew-symmetric, 381 

smoothness, 437 

solution curve, 154 

Solution Page, 117 

solvable, 255, 257, 277, 311 

source, 17, 19, 40, 162 

span, 256, 260, 285, 288, 296 

sparse matrices, 223 

special inputs, 131, 139 

special solution, 261, 265, 302 

spectral theorem, 376, 383 

speed of light, 464 

spike, 23, 407, 437, 438 

spiral, 33, 86, 88, 95, 161 

spiral sink, 163 

spring, 74, 119 

square root, 397 

square wave, 435, 437, 443, 456 
stability, 49, 58-60, 187, 188 

stability limit, 190, 195 

stability line, 58, 170 

stability test, 165-170, 175, 188, 339, 353 
stable, 161, 169, 352, 472 

standing wave, 465 

starting value (initial condition), 2, 9 
state space, 127 

statistics, 401, 458 

steady state, 21, 49, 53, 58, 155, 328, 357 
Stefan-Boltzmann Law, 49, 63 

step function, 21, 23, 474, 475, 478, 489 
step response, 22, 81, 97, 102, 124-128 
stepsize, 184 

stiff equation, 187 

stiff system, 193 

stiffness, 118, 468 

stiffness matrix, 124, 372, 385 
stock prices, 457 

straight line, 386 

subspace, 251-254, 256, 258, 296 
Sudoku matrix, 209 

sum of spaces, 260 

sum of squares, 386, 388 
superposition, 8, 349, 460 

SVD, 244, 398, 382, 385, 399-405, 431 
switch, 22 

symmetric and orthogonal, 244, 378 
symmetric matrix, 238, 239, 292, 375, 409 
symmetry, 468 

system, 164, 197, 325 


T 


Table of Eigenvalues, 382 

Table of Rules, 476 

Table of Transforms, 146, 471 

tangent, 75, 80, 156 

tangent line, 6, 184 

tangent parabola, 7, 191 

Taylor series, 7, 10, 14, 16, 185 
temperature, 46, 442, 455, 459 

test grades, 395 

three steps, 341, 349, 369 

time constant, 100 

time domain, 120, 127 

time lag, 81 

time-varying, 367, 371, 384 

Toeplitz matrix, 480, 482 

Toomre, 178 

trace, 175, 331, 332, 336, 347, 353, 384 
transfer function, 104, 121, 432, 477, 481 
transient, 27, 103 

tree, 317 

triangular matrix, 213, 238, 293, 490, 492 
tridiagonal matrix, 232, 246, 382, 410, 454 
tumbling box, 176, 178, 183 


U 

underdamping, 96, 100, 102, 117 
undetermined coefficients, 117, 130-137 
uniqueness, 154, 289 

unit circle, 33, 84, 85, 94, 448 

unit vector, 334 
unitary matrix, 377 

units, 44, 52, 456 

unstable, 49, 53, 166 

upper triangular, 210, 213 


V 


variable coefficient, 1, 42, 130 
variance, 392, 395, 401, 431 
variation of parameters, 41, 43, 130, 
133-135, 138, 482 
vector, 164, 199, 200, 251, 252 
vector space, 251, 252, 298, 321 
very particular, 26, 27, 117, 144 
violin, 465, 469 
Voltage Law, 123, 317, 318 
voltage source, 425 


W 


wave equation, 463-466, 469 
weighted Laplacian, 424 
weighted least squares, 390, 392 
Wikipedia, 243, 431 
Wolfram Alpha, 194 
Wronskian, 134, 135, 366, 384 


Z 


zerocline, 157 
zeta, 99, 113 


Index of Symbols 


in 
A = LU, 414, 490 
A = QR, 490 
A = QS, 431 
$A = U \Sigma V^T$, 382, 398, 401 
$A = V \Lambda V^{-1}$, 337, 341 
$A^T A$, 239, 276, 312, 385, 395, 417, 423 
$A^T C A$, 392, 404, 416, 425, 427 
Bean, AG 
pate acu 
$K = A^T C A$, 410, 423, 424 
P(D), 108, 117 
Q, 238 
SSE D1 2403 
$S = Q \Lambda Q^T$, 376 
$S^{\perp}$, 307 
C(A) and N(A), 255, 261 
$\mathbf{R}^n$ and $\mathbf{C}^n$, 251 


LINEAR ALGEBRA IN A NUTSHELL 
(The matrix A is n by n) 


Nonsingular 


A is invertible 

The columns are independent 

The rows are independent 

The determinant is not zero 

Ax = 0 has one solution x = 0 

Ax = b has one solution $x = A^{-1}b$ 
A has n (nonzero) pivots 

A has full rank r=n 


The reduced row echelon form is R=I 


The column space is all of $\mathbf{R}^n$ 

The row space is all of $\mathbf{R}^n$ 

All eigenvalues are nonzero 

$A^T A$ is symmetric positive definite 
A has n (positive) singular values 


Singular 


A is not invertible 

The columns are dependent 

The rows are dependent 

The determinant is zero 

Ax = 0 has infinitely many solutions 
Ax = b has no solution or infinitely many 
A has r < n pivots 

A has rank r < n 

R has at least one zero row 

The column space has dimension r < n 
The row space has dimension r < n 
Zero is an eigenvalue of A 

$A^T A$ is only semidefinite 

A has r < n singular values 
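
Several rows of the two lists reduce to one computation: count the pivots. The sketch below is an illustration under assumptions, not the book's code. The function names `count_pivots` and `is_nonsingular` and the test matrices are hypothetical. It runs elimination with exact fractions and reports the rank r, so an n by n matrix is nonsingular exactly when r = n.

```python
from fractions import Fraction


def count_pivots(rows):
    """Eliminate on a copy of the matrix and return the number of pivots (the rank r)."""
    a = [[Fraction(x) for x in row] for row in rows]
    m, n = len(a), len(a[0])
    r = 0  # index of the next pivot row
    for col in range(n):
        if r == m:
            break  # every row already holds a pivot
        # Look for a nonzero entry in this column, at or below row r.
        p = next((i for i in range(r, m) if a[i][col] != 0), None)
        if p is None:
            continue  # free column: no pivot here
        a[r], a[p] = a[p], a[r]
        for i in range(r + 1, m):
            f = a[i][col] / a[r][col]
            for j in range(col, n):
                a[i][j] -= f * a[r][j]
        r += 1
    return r


def is_nonsingular(rows):
    """A square matrix is nonsingular exactly when it has n pivots (full rank r = n)."""
    return count_pivots(rows) == len(rows)


invertible = [[2, 1], [1, 3]]
dependent = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # row 3 = 2 * row 2 - row 1

print(count_pivots(invertible), is_nonsingular(invertible))
print(count_pivots(dependent), is_nonsingular(dependent))
```

The other rows of the lists (determinant, eigenvalues, singular values) give the same verdict on these matrices. Pivot counting is used here only because it needs nothing beyond elimination.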